Robots.txt Generator
Build a robots.txt file with common crawl rules.
Presets replace the builder below. Edit any field afterwards to fine-tune.
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: https://example.com/sitemap.xml
Save this file as robots.txt in your site root so it is served at https://yourdomain.com/robots.txt.
How to use Robots.txt Generator
What this tool does
This robots.txt generator turns a visual builder into a correctly formatted
robots.txt file. Instead of memorizing the syntax, you add one or more
user-agent groups, list the paths each crawler should be allowed or disallowed
from fetching, optionally set a crawl delay, and add the URLs of your XML
sitemaps. The tool assembles a valid file in real time and lets you copy it or
download it ready to drop into your site root. Quick-start presets cover the
most common configurations — allow everything, block everything, block AI
crawlers, and a typical WordPress setup — so you can start from a sensible base
and adjust from there.
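The assembly step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the `Group` class and the `build_robots_txt` function are hypothetical names, and the rule ordering (Allow before Disallow) is an arbitrary choice made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    """One user-agent group (hypothetical structure, for illustration only)."""
    user_agent: str = "*"
    disallow: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)
    crawl_delay: int | None = None


def build_robots_txt(groups: list[Group], sitemaps: list[str]) -> str:
    """Assemble groups and sitemap URLs into robots.txt text."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {group.user_agent}"]
        lines += [f"Allow: {path}" for path in group.allow]
        lines += [f"Disallow: {path}" for path in group.disallow]
        if not group.allow and not group.disallow:
            lines.append("Disallow:")  # an empty Disallow permits everything
        if group.crawl_delay is not None:
            lines.append(f"Crawl-delay: {group.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    # Separate groups with a blank line and end the file with a newline.
    return "\n\n".join(blocks) + "\n"


text = build_robots_txt(
    [Group(disallow=["/cgi-bin/", "/tmp/"])],
    ["https://example.com/sitemap.xml"],
)
print(text)
```

Calling it with the default group shown at the top of this page reproduces the same rules, with a blank line separating the group from the Sitemap line.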
A robots.txt file is the first thing most search engine crawlers request when
they visit a site. It uses the Robots Exclusion Protocol, a simple plain-text
format that every major crawler understands. Getting it right matters because a
single misplaced character can either expose pages you wanted kept out of the
crawl or, far worse, block your entire site from being indexed.
Why it matters for SEO
robots.txt helps you manage your crawl budget: the finite amount of crawling a search
engine will do on your site in a given period. By disallowing low-value paths
such as internal search results, faceted-navigation URLs, staging areas, and
admin endpoints, you steer crawlers toward the pages that actually earn rankings
and traffic. On large sites this directly affects how quickly new and updated
content is discovered.
The file is also where you point crawlers at your XML sitemap. Listing the
sitemap URL here means any crawler that finds your robots.txt — which is
effectively all of them — also learns where your complete URL list lives, even
if you never submit it manually. That speeds up discovery of new pages.
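To illustrate how a crawler can pick up the sitemap location from the file, here is a small sketch using Python's standard `urllib.robotparser` module (its `site_maps()` method needs Python 3.8 or later). The file content is the example from the top of this page; real crawlers use their own parsers, so treat this as a demonstration of the idea only.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of sitemap URLs, or None when none are listed.
sitemaps = parser.site_maps()
print(sitemaps)
```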
Crucially, robots.txt is a crawl directive, not an indexing directive. It
tells a crawler whether it may fetch a URL; it does not tell a search engine
whether to show that URL in results. A page blocked by robots.txt can still
be indexed from inbound links, appearing as a bare URL with no snippet. If your
goal is to remove a page from search results, the page must remain crawlable so
the engine can see a noindex directive — blocking it in robots.txt would
hide that very instruction.
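The conflict between blocking and de-indexing can be checked mechanically. The sketch below uses only the Python standard library; `RobotsMetaFinder` and `noindex_is_hidden` are hypothetical helper names, and it looks only at a `<meta name="robots">` tag, not at HTTP headers.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            if "noindex" in (attributes.get("content") or "").lower():
                self.noindex = True


def noindex_is_hidden(robots_txt, url, html, agent="*"):
    """True when the page carries noindex but robots.txt blocks the fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex and not rules.can_fetch(agent, url)


rules_text = "User-agent: *\nDisallow: /old/\n"
page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(noindex_is_hidden(rules_text, "https://example.com/old/page.html", page))
```

A result of `True` means the crawler is told not to fetch the page, so it would never read the noindex instruction inside it.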
How to use it
- Pick a quick-start preset, or start from the default group, to populate the builder.
- For each user-agent group, set the User-agent token (* matches all crawlers) and add Disallow paths for anything that should not be crawled.
- Add Allow paths to carve out exceptions inside a disallowed directory, for example allowing /wp-admin/admin-ajax.php while disallowing /wp-admin/.
- Optionally set a Crawl-delay in seconds for crawlers that respect it.
- Add the absolute URL of each XML sitemap in the Sitemap section.
- Copy the generated text or download robots.txt, then upload it to your site root so it is reachable at https://yourdomain.com/robots.txt.
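Before uploading, it can be worth sanity-checking the generated text locally. The sketch below feeds a hypothetical WordPress-style output to Python's standard `urllib.robotparser`. One caveat: that parser applies the first rule that matches, while crawlers such as Googlebot pick the most specific (longest) match, so the Allow line is deliberately placed before the Disallow line here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical generated output for a WordPress-style setup.
# Allow comes first because Python's parser uses the first matching rule.
generated = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(generated.splitlines())

ajax_ok = parser.can_fetch("*", "https://example.com/wp-admin/admin-ajax.php")
admin_ok = parser.can_fetch("*", "https://example.com/wp-admin/options.php")
post_ok = parser.can_fetch("*", "https://example.com/hello-world/")
delay = parser.crawl_delay("*")

print(ajax_ok, admin_ok, post_ok, delay)
```

The exception for admin-ajax.php is fetchable, the rest of /wp-admin/ is blocked, ordinary pages are unaffected, and the crawl delay is read back as 10 seconds.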
SEO best practices
Keep the file lean and intentional — only block paths you have a real reason to
block. Always disallow with a path that starts with /, and remember that rules
are prefix matches: Disallow: /tmp blocks /tmp/ and /tmpfile.html alike.
Use the wildcard * and the end-of-URL anchor $ for pattern matching when a
crawler supports them. Reference every sitemap you maintain, and after
publishing the file, validate it with Google Search Console’s robots.txt report.
When you launch a new site, double-check that you have not left a development
Disallow: / in place — it is the single most common cause of a site
disappearing from search.
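Prefix matching, the * wildcard, and the $ anchor are easier to reason about with a small matcher. The following is a minimal sketch assuming the longest-match precedence described in RFC 9309 (the most specific rule wins, and Allow wins a tie); `pattern_to_regex` and `is_allowed` are hypothetical names, and individual crawlers may behave differently.

```python
from __future__ import annotations

import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text, then let each * stand for any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """rules: ("allow" | "disallow", pattern) pairs for one user-agent group."""
    best_length, allowed = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # The longest pattern wins; on a tie, allow wins.
            if length > best_length or (length == best_length and kind == "allow"):
                best_length, allowed = length, kind == "allow"
    return allowed


tmp_rules = [("disallow", "/tmp")]
print(is_allowed(tmp_rules, "/tmp/"), is_allowed(tmp_rules, "/tmpfile.html"))

pdf_rules = [("disallow", "/*.pdf$")]
print(is_allowed(pdf_rules, "/docs/a.pdf"), is_allowed(pdf_rules, "/docs/a.pdf?x=1"))
```

The first pair shows the prefix behaviour: both /tmp/ and /tmpfile.html are blocked by Disallow: /tmp. The second shows the $ anchor: the rule blocks URLs that end in .pdf but not the same URL with a query string appended.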
Common mistakes to avoid
The most damaging mistake is blocking CSS and JavaScript files that the page
needs to render; search engines render pages like browsers do, and a page they
cannot render fully may be judged poorly. Do not rely on robots.txt to hide
confidential content — the file is public and lists exactly what you are trying
to conceal. Avoid blocking a URL you also want de-indexed, because the crawler
then never sees the noindex tag. Finally, mind capitalization and spacing: path
values are case-sensitive, so /Tmp/ and /tmp/ are different rules, and a stray
space inside a path can keep a rule from matching.
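Some of these mistakes can be caught with a quick automated pass over the file. The sketch below is a deliberately simple, hypothetical linter (`lint_robots_txt` is not part of any library): it flags a site-wide Disallow, paths that do not start with / or *, and Disallow rules that target CSS or JavaScript files. It does not replace validating the published file.

```python
def lint_robots_txt(text):
    """Return a list of warnings for a few common robots.txt mistakes."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if ":" not in line:
            warnings.append(f"line {number}: missing ':' separator")
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        name = name.lower()
        if name not in ("allow", "disallow") or not value:
            continue
        if not value.startswith(("/", "*")):
            warnings.append(f"line {number}: path should start with '/'")
        if name == "disallow":
            if value == "/":
                warnings.append(f"line {number}: blocks the entire site")
            if value.rstrip("$").lower().endswith((".css", ".js")):
                warnings.append(f"line {number}: blocks CSS or JavaScript")
    return warnings


sample = """\
User-agent: *
Disallow: /
Disallow: tmp/
Disallow: /*.css$
Allow: /public/
"""
for warning in lint_robots_txt(sample):
    print(warning)
```

A site-wide Disallow is sometimes intentional, for example in a group that targets one specific crawler, so each warning is a prompt to double-check rather than an error.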
Privacy & your data
This generator is completely client-side. Every user-agent group, path rule,
crawl-delay value, and sitemap URL you enter is processed by JavaScript running
in your own browser. Nothing is transmitted to a server, nothing is stored
between visits, and nothing you type is logged or tracked. The generated
robots.txt lives only in the page until you copy or download it, and it is
gone the moment you close the tab. That makes the tool safe to use even while
planning the structure of an unreleased site.
Frequently asked questions
Where does robots.txt need to live?
Does Disallow stop a page from appearing in Google?
Is robots.txt a security feature?
What is Crawl-delay and does Google obey it?
Is the robots.txt I build here uploaded anywhere?
Related tools
Sitemap.xml Generator
Generate an XML sitemap from a list of URLs.
Meta Tag Generator
Generate SEO meta title, description, and keyword tags.
Canonical URL Builder
Build canonical URL tags to avoid duplicate content.
Hreflang Tag Generator
Generate hreflang tags for multilingual SEO.
Schema Markup Generator
Generate JSON-LD structured data markup.
URL Parser
Break a URL into its component parts.