Robots.txt Generator

Build a robots.txt file with common crawl rules.

Quick-start presets

Presets replace the builder below. Edit any field afterwards to fine-tune.

Disallow paths: Paths crawlers should not request, e.g. /admin/
Allow paths: Carve-outs inside a disallowed path, e.g. /admin/public/

Sitemap URLs: Absolute URLs to your XML sitemap(s). These apply to all crawlers.

Generated robots.txt
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml

Save this file as robots.txt in your site root so it is served at https://yourdomain.com/robots.txt.

How to use Robots.txt Generator

What this tool does

This robots.txt generator turns the choices you make in a visual builder into a correctly formatted robots.txt file. Instead of memorizing the syntax, you add one or more user-agent groups, list the paths each crawler may or may not fetch, optionally set a crawl delay, and add the URLs of your XML sitemaps. The tool assembles the file in real time and lets you copy or download it, ready to drop into your site root. Quick-start presets cover the most common configurations (allow everything, block everything, block AI crawlers, and a typical WordPress setup), so you can start from a sensible base and adjust from there.
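
As a rough illustration of what "assembling the file" means, the sketch below turns a list of user-agent groups and sitemap URLs into robots.txt text. It is written in Python purely for illustration; the Group class and build_robots_txt function are hypothetical names, not the tool's actual code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    """One user-agent group as entered in a builder (hypothetical structure)."""
    user_agent: str = "*"
    disallow: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None


def build_robots_txt(groups: list[Group], sitemaps: list[str]) -> str:
    """Assemble groups and sitemap URLs into robots.txt text."""
    blocks = []
    for g in groups:
        lines = [f"User-agent: {g.user_agent}"]
        lines += [f"Allow: {p}" for p in g.allow]
        lines += [f"Disallow: {p}" for p in g.disallow]
        if not g.allow and not g.disallow:
            # An empty Disallow value means "nothing is blocked".
            lines.append("Disallow:")
        if g.crawl_delay is not None:
            lines.append(f"Crawl-delay: {g.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemaps:
        # Sitemap lines sit outside any group and apply to all crawlers.
        blocks.append("\n".join(f"Sitemap: {u}" for u in sitemaps))
    return "\n\n".join(blocks) + "\n"


text = build_robots_txt(
    [Group(disallow=["/cgi-bin/", "/tmp/"])],
    ["https://example.com/sitemap.xml"],
)
print(text)
```

With these inputs the sketch prints the same five-line file shown in the "Generated robots.txt" sample above. Groups are separated by a blank line, and Allow lines are written before Disallow lines so that parsers which stop at the first matching rule still see the carve-outs.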

A robots.txt file is the first thing most search engine crawlers request when they visit a site. It uses the Robots Exclusion Protocol (standardized as RFC 9309), a simple plain-text format that every major crawler understands. Getting it right matters because a single misplaced character can either expose pages you wanted kept out of the crawl or, far worse, block crawlers from your entire site.

Why it matters for SEO

robots.txt helps you decide how your crawl budget is spent. Crawl budget is the finite amount of crawling a search engine will do on your site in a given period. By disallowing low-value paths such as internal search results, faceted-navigation URLs, staging areas, and admin endpoints, you steer crawlers toward the pages that actually earn rankings and traffic. On large sites this directly affects how quickly new and updated content is discovered.

The file is also where you point crawlers at your XML sitemap. Listing the sitemap URL here means any crawler that finds your robots.txt — which is effectively all of them — also learns where your complete URL list lives, even if you never submit it manually. That speeds up discovery of new pages.

Crucially, robots.txt is a crawl directive, not an indexing directive. It tells a crawler whether it may fetch a URL; it does not tell a search engine whether to show that URL in results. A page blocked by robots.txt can still be indexed from inbound links, appearing as a bare URL with no snippet. If your goal is to remove a page from search results, the page must remain crawlable so the engine can see a noindex directive — blocking it in robots.txt would hide that very instruction.
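
One way to catch this conflict before publishing is to compare the pages you have marked noindex against your robots.txt rules. The sketch below uses Python's standard urllib.robotparser module; the function name hidden_noindex_pages and the sample URLs are hypothetical, and the module only does plain prefix matching, so treat the result as a first check rather than a reproduction of any search engine's behavior.

```python
from urllib.robotparser import RobotFileParser


def hidden_noindex_pages(robots_txt: str, noindex_urls: list[str],
                         user_agent: str = "Googlebot") -> list[str]:
    """Return the noindex pages that robots.txt stops user_agent from fetching.

    A crawler that cannot fetch a page never sees its noindex directive.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


robots_txt = "User-agent: *\nDisallow: /old-promo/\n"
pages_marked_noindex = [
    "https://example.com/old-promo/",   # blocked, so the noindex tag is never seen
    "https://example.com/thank-you/",   # crawlable, so the noindex tag can work
]
conflicts = hidden_noindex_pages(robots_txt, pages_marked_noindex)
print(conflicts)
```

Any URL the function returns needs one of two fixes: remove the Disallow rule so the noindex directive can be read, or drop the noindex and accept that the page is only hidden from crawling.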

How to use it

  1. Pick a quick-start preset, or start from the default group, to populate the builder.
  2. For each user-agent group, set the User-agent token (* matches all crawlers) and add Disallow paths for anything that should not be crawled.
  3. Add Allow paths to carve out exceptions inside a disallowed directory — for example, allowing /wp-admin/admin-ajax.php while disallowing /wp-admin/.
  4. Optionally set a Crawl-delay in seconds for crawlers that respect it.
  5. Add the absolute URL of each XML sitemap in the Sitemap section.
  6. Copy the generated text or download robots.txt, then upload it to your site root so it is reachable at https://yourdomain.com/robots.txt.

SEO best practices

Keep the file lean and intentional: only block paths you have a real reason to block. Always disallow with a path that starts with /, and remember that rules are prefix matches: Disallow: /tmp blocks /tmp/ and /tmpfile.html alike. Use the wildcard * and the end-of-URL anchor $ for pattern matching when a crawler supports them. Reference every sitemap you maintain, and after publishing the file, validate it with Google Search Console’s robots.txt report. When you launch a new site, double-check that you have not left a development Disallow: / in place, because it is one of the most common causes of a site disappearing from search.
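
To make the prefix, wildcard, and anchor behavior concrete, here is a minimal Python sketch of rule matching. It assumes the longest-match precedence described in RFC 9309 (the most specific rule wins, and Allow wins a tie); individual crawlers may differ. The function names rule_to_regex and is_allowed are hypothetical.

```python
import re


def rule_to_regex(path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters, a trailing '$' anchors the end of
    the URL, and everything else is treated as a literal prefix.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path: str, allow: list[str], disallow: list[str]) -> bool:
    """Longest matching rule wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for rules, verdict in ((disallow, False), (allow, True)):
        for rule in rules:
            # Empty rules (a bare "Disallow:") match nothing.
            if rule and rule_to_regex(rule).match(url_path):
                if len(rule) > best_len or (len(rule) == best_len and verdict):
                    best_len, allowed = len(rule), verdict
    return allowed


# Prefix matching: "/tmp" covers both of these paths.
print(is_allowed("/tmp/", [], ["/tmp"]))          # False
print(is_allowed("/tmpfile.html", [], ["/tmp"]))  # False

# Allow carve-out inside a disallowed directory.
wp_allow, wp_disallow = ["/wp-admin/admin-ajax.php"], ["/wp-admin/"]
print(is_allowed("/wp-admin/admin-ajax.php", wp_allow, wp_disallow))  # True
print(is_allowed("/wp-admin/options.php", wp_allow, wp_disallow))     # False

# Wildcard plus end anchor: only URLs that end in .pdf are blocked.
print(is_allowed("/files/report.pdf", [], ["/*.pdf$"]))         # False
print(is_allowed("/files/report.pdf?page=2", [], ["/*.pdf$"]))  # True
```

The matching is case-sensitive, so a rule written as /tmp does not cover /Tmp/.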

Common mistakes to avoid

The most damaging mistake is blocking CSS and JavaScript files that the page needs to render; search engines render pages like browsers do, and a page they cannot render fully may be judged poorly. Do not rely on robots.txt to hide confidential content, because the file is public and lists exactly what you are trying to conceal. Avoid blocking a URL you also want de-indexed, because the crawler then never sees the noindex tag. Finally, mind capitalization and typos: directive names such as Disallow are not case-sensitive, but the paths that follow them are, so Disallow: /Admin/ does not cover /admin/, and a stray character in a path can stop a rule from matching anything.

Privacy & your data

This generator is completely client-side. Every user-agent group, path rule, crawl-delay value, and sitemap URL you enter is processed by JavaScript running in your own browser. Nothing is transmitted to a server, nothing is stored between visits, and nothing you type is logged or tracked. The generated robots.txt lives only in the page until you copy or download it, and it is gone the moment you close the tab. That makes the tool safe to use even while planning the structure of an unreleased site.

Frequently asked questions

Where does robots.txt need to live?
It must sit at the root of your domain and be served at exactly /robots.txt — for example https://example.com/robots.txt. Crawlers only look in that one location, so a file placed in a subfolder or under a different name is ignored. Each subdomain and protocol is treated separately, so example.com and blog.example.com each need their own file.
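
As a small illustration, the Python sketch below derives the robots.txt location that governs any given page URL by keeping the scheme, host, and port and replacing everything else. The function name robots_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (same scheme, host, port)."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=1"))  # https://example.com/robots.txt
print(robots_url("https://blog.example.com/archive/"))   # https://blog.example.com/robots.txt
```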
Does Disallow stop a page from appearing in Google?
No. Disallow tells well-behaved crawlers not to fetch a page, but a blocked URL can still be indexed if other pages link to it — Google may show it with no description. To keep a page out of search results, allow it to be crawled and add a noindex meta tag or X-Robots-Tag header, or protect it behind authentication.
Is robots.txt a security feature?
No, and you should never treat it as one. The file is public, so listing a Disallow path actually advertises its existence to anyone who reads it. Malicious bots ignore robots.txt entirely. Anything that must stay private needs real access control such as a login, IP restriction, or server-side authorization.
What is Crawl-delay and does Google obey it?
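Crawl-delay asks a crawler to wait a number of seconds between requests to reduce server load. It is a non-standard directive: Bing and several other crawlers honor it, but Googlebot ignores it. Google adjusts its crawl rate automatically based on how your server responds, and the crawl rate limiter that Search Console used to offer was retired in early 2024. Crawl-delay is still useful for limiting aggressive secondary crawlers that support it.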
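
For a sense of how a crawler that does honor the directive might read it, the sketch below parses a sample file with Python's standard urllib.robotparser module. The sample rules and the bot name SomeOtherBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /search/

User-agent: *
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The bingbot group sets a delay; the catch-all group does not.
bing_delay = parser.crawl_delay("bingbot")
other_delay = parser.crawl_delay("SomeOtherBot")

print(bing_delay)   # 10
print(other_delay)  # None
```

A crawler written this way would sleep for the returned number of seconds between requests and skip the pause when the value is None.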
Is the robots.txt I build here uploaded anywhere?
No. The generator runs entirely in your browser using JavaScript. The user-agent groups, paths, and sitemap URLs you enter are never sent to a server, never stored, and never logged. The generated file exists only on your device until you copy or download it.