Free SEO Tool

Robots.txt Generator

Build a robots.txt file with a visual rule editor. Apply presets for allowing all bots, blocking AI training crawlers, or protecting private site sections. Live preview and instant copy — no coding required.

robots.txt
User-agent: *
Allow: /

How to deploy this file

  1. Copy the file contents above
  2. Save as robots.txt (plain text; use exactly this lowercase filename)
  3. Upload to your website root: yoursite.com/robots.txt
  4. Check it in Google Search Console → Settings → Crawling (robots.txt report)
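
Crawlers only look for the file at the root of each host, so a copy at yoursite.com/blog/robots.txt is ignored. As a minimal sketch (the helper name `robots_txt_url` is made up for illustration, and yoursite.com is the placeholder domain used on this page), this is how you could work out where the file has to live for any page URL:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location that governs the given page URL.

    Hypothetical helper: keeps the scheme and host (including any port or
    subdomain) and swaps the path for /robots.txt.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Drop the original path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/blog/post?id=1"))
# https://yoursite.com/robots.txt
print(robots_txt_url("https://shop.yoursite.com/cart"))
# https://shop.yoursite.com/robots.txt
```

Note that a subdomain such as shop.yoursite.com needs its own robots.txt; the file on the main domain does not cover it.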

Common web crawlers reference

User-agent      | Owner         | Purpose                          | Block?
Googlebot       | Google        | Google Search indexing           | No
Bingbot         | Microsoft     | Bing Search indexing             | No
Google-Extended | Google        | AI training (Gemini)             | Optional
GPTBot          | OpenAI        | AI training (GPT models)         | Optional
ChatGPT-User    | OpenAI        | ChatGPT browsing plugin          | Optional
ClaudeBot       | Anthropic     | AI training (Claude)             | Optional
CCBot           | Common Crawl  | Open data (used in AI training)  | Optional
PerplexityBot   | Perplexity AI | Perplexity search/AI answers     | Optional
Bytespider      | ByteDance     | TikTok/AI data collection        | Recommended
Amazonbot       | Amazon        | Alexa/AI training                | Optional
AhrefsBot       | Ahrefs        | SEO backlink index               | Optional
SemrushBot      | Semrush       | SEO data collection              | Optional

Frequently asked questions

What is a robots.txt file?

A robots.txt file is a plain-text file placed at the root of a website (e.g. yoursite.com/robots.txt) that tells web crawlers which pages or sections they are and are not allowed to access. It uses the Robots Exclusion Protocol — a set of simple directives like "User-agent" (which crawler), "Disallow" (paths to block), "Allow" (paths to permit), and "Sitemap" (your XML sitemap URL). Robots.txt is one of the first files search engine crawlers check when visiting a site.
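
To see how those directives are read in practice, here is a small sketch using Python's standard-library `urllib.robotparser` (the `site_maps()` call needs Python 3.8 or later). The sample file, the /private/ path, and the sitemap URL are invented for illustration. One caveat: Python's parser applies rules in file order, while Google picks the most specific matching rule, so the sample lists the Disallow line first to keep both interpretations in agreement.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file contents; the paths and sitemap URL are placeholders.
SAMPLE = """\
User-agent: *
Disallow: /private/
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
"""


def parse_robots(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(SAMPLE)
print(parser.can_fetch("Googlebot", "https://yoursite.com/blog/"))      # True
print(parser.can_fetch("Googlebot", "https://yoursite.com/private/x"))  # False
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```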

Should I block AI crawlers in my robots.txt?

Whether to block AI training crawlers depends on your priorities. Blocking them prevents your content from being used to train AI language models (like GPT and Gemini), which some publishers and content creators prefer for copyright and commercial reasons. However, some AI systems (like Bing's AI and Google's AI Overviews) use crawlers that also power their search indexing — blocking them may reduce your visibility in those features. The decision is yours: the tool provides ready-made presets for blocking specific AI crawlers while leaving major search engine bots unaffected.

Does robots.txt prevent pages from being indexed?

Robots.txt asks crawlers not to access your pages, but it does not guarantee those pages won't appear in search results. Google can still index a URL it hasn't crawled if other sites link to it; the result will just show minimal information without a snippet. To reliably keep a page out of the index, use a "noindex" meta tag or X-Robots-Tag HTTP header, and do not also block that page in robots.txt: a crawler has to be able to fetch the page in order to see the noindex instruction.
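
How you send the X-Robots-Tag header depends on your server or framework. As one hedged illustration, the sketch below is a bare WSGI app in Python that adds the header for a couple of path prefixes. The prefixes in `PRIVATE_PREFIXES` and the helper `response_headers` are assumptions made up for this example, not part of any real site or library.

```python
# Hypothetical path prefixes that should stay out of search indexes.
PRIVATE_PREFIXES = ("/drafts/", "/internal/")


def app(environ, start_response):
    """Minimal WSGI app that marks private paths as noindex."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path.startswith(PRIVATE_PREFIXES):
        # Crawlers that fetch this URL are told not to index it.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"ok\n"]


def response_headers(path: str) -> dict:
    """Call the app directly (no server needed) and return its headers."""
    captured = {}

    def start_response(status, headers):
        captured["headers"] = dict(headers)

    b"".join(app({"PATH_INFO": path}, start_response))
    return captured["headers"]


print(response_headers("/drafts/post-1").get("X-Robots-Tag"))  # noindex
print(response_headers("/blog/").get("X-Robots-Tag"))          # None
```

For this to work, the same paths must stay crawlable in robots.txt, otherwise the header is never seen.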

What AI crawlers are included in the blocking preset?

The "Block AI Crawlers" preset blocks the following known AI crawlers and agents: GPTBot (OpenAI), ChatGPT-User (OpenAI), ClaudeBot (Anthropic), anthropic-ai, CCBot (Common Crawl, used by many AI companies), Google-Extended (Google AI training, separate from Googlebot), PerplexityBot, Bytespider (ByteDance/TikTok), Meta-ExternalAgent (Meta), and Amazonbot. Googlebot and Bingbot are NOT included. These bots also power search indexing and should typically be allowed.
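
For a rough idea of what such a preset produces, the sketch below builds one named block per crawler from the list above and then a permissive catch-all block. The function name `build_block_preset` and the exact layout are assumptions for illustration; the tool's real output may be formatted differently.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the list in the answer above.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai", "CCBot",
    "Google-Extended", "PerplexityBot", "Bytespider",
    "Meta-ExternalAgent", "Amazonbot",
]


def build_block_preset(agents=AI_CRAWLERS) -> str:
    """Return robots.txt text that blocks each named agent site-wide."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    # Every crawler not named above stays allowed.
    blocks.append("User-agent: *\nAllow: /")
    return "\n\n".join(blocks) + "\n"


robots_text = build_block_preset()

# Sanity-check the result with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_text.splitlines())
print(parser.can_fetch("GPTBot", "https://yoursite.com/"))     # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/"))  # True
```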

What happens if I block Googlebot?

Blocking Googlebot entirely with "Disallow: /" will prevent Google from crawling your site, which will eventually remove your pages from Google search results as the indexed copies become stale. This is almost never intentional. Be very careful when configuring User-agent rules: always explicitly target only the bots you want to block. The generator uses named User-agent directives for each bot rather than a catch-all wildcard, making it safer to use.
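
One way to catch this mistake before uploading is a quick local check. The sketch below is only an illustration: the helper `blocked_search_bots` and the two sample files are hypothetical, and Python's `urllib.robotparser` does not match rules exactly the way Google does, so treat the result as a first-pass warning rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

SEARCH_BOTS = ("Googlebot", "Bingbot")


def blocked_search_bots(robots_text, probe_url="https://yoursite.com/"):
    """Return the search bots that cannot fetch probe_url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [bot for bot in SEARCH_BOTS if not parser.can_fetch(bot, probe_url)]


# A catch-all wildcard block shuts out every crawler, search bots included.
RISKY = "User-agent: *\nDisallow: /\n"

# A named block only affects the bot it names.
SAFE = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"

print(blocked_search_bots(RISKY))  # ['Googlebot', 'Bingbot']
print(blocked_search_bots(SAFE))   # []
```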

How do I verify my robots.txt is working correctly?

After uploading your robots.txt to your site's root directory, you can verify it using Google Search Console's robots.txt report (found under Settings > Crawling), which shows the robots.txt files Google found for your site, when they were last fetched, and any warnings or errors. To check whether Googlebot can access a specific URL under your current rules, use the URL Inspection tool in Search Console. You can also visit yoursite.com/robots.txt directly in a browser to confirm the file is live and readable. Allow 24–48 hours for crawlers to re-read an updated robots.txt.