robots.txt for AI Crawlers

The right robots posture separates public discovery from private surfaces without accidentally blocking useful assistant referrals.

Direct answer: Your robots policy should allow legitimate search and assistant fetch bots on public content while blocking admin, preview, staging, and unpublished draft paths.

Machine read

  • Primary entity: AI crawler policy
  • Extractable answer: High
  • Citation potential: Medium
  • Main issue: Teams block or allow bots too broadly because they do not distinguish fetch bots from training bots.

Human read

Good robots policy is operational hygiene, not ideology. You want visibility on public pages and restraint on everything else.

What to change

  1. Explicitly disallow admin, preview, staging, and draft paths.
  2. Document separate decisions for search bots, assistant fetch bots, and training-oriented bots.
  3. Track crawler behavior in Cloudflare so policy decisions can be based on evidence.

Hidden failure mode: One blanket block kills high-value assistant referrals along with low-value bot traffic.

Noise check: robots.txt is not a substitute for fixing weak information architecture or poor page quality.
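
The changes above can be sketched as a robots.txt file. The bot names and paths here are illustrative assumptions; verify current user-agent tokens against each vendor's documentation before shipping:

```txt
# Search and assistant fetch bots: public pages open, private surfaces blocked
User-agent: Googlebot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
Disallow: /admin/
Disallow: /preview/
Disallow: /staging/
Disallow: /drafts/

# Training-oriented bots: make an explicit, documented decision per bot
User-agent: GPTBot
Disallow: /

# Everyone else: same private-surface blocks as above
User-agent: *
Disallow: /admin/
Disallow: /preview/
Disallow: /staging/
Disallow: /drafts/
```

Grouping several User-agent lines over one rule set is valid under the Robots Exclusion Protocol (RFC 9309) and keeps the fetch-bot policy in one place.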

The playbook

  • Owner: Platform operations
  • Effort: Half a sprint
  • Expected outcome: Clear crawler access rules with fewer accidental visibility losses.
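
The evidence-gathering step can start as a simple user-agent tally over request logs. A minimal sketch, assuming JSON-lines records with a `ClientRequestUserAgent` field in the style of Cloudflare Logpush exports (the field name and sample records are assumptions for illustration):

```python
import json
from collections import Counter

# Illustrative JSON-lines log records (stand-ins for a real log export)
log_lines = [
    '{"ClientRequestUserAgent": "GPTBot/1.0", "ClientRequestPath": "/blog/post"}',
    '{"ClientRequestUserAgent": "GPTBot/1.0", "ClientRequestPath": "/admin/"}',
    '{"ClientRequestUserAgent": "Mozilla/5.0", "ClientRequestPath": "/"}',
]

# Tally requests per user agent so allow/block decisions rest on evidence
counts = Counter(
    json.loads(line)["ClientRequestUserAgent"] for line in log_lines
)
print(counts.most_common())
```

A real pipeline would segment by path as well, so a bot that respects public pages but probes private surfaces stands out.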

FAQ

Should every AI-related bot be blocked?

No. Search and assistant fetch bots can provide referral and citation value on public pages.

What must stay blocked?

Admin routes, draft paths, preview URLs, and staging environments should stay out of public crawl surfaces.
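
Whether a given path is actually blocked can be verified with Python's standard-library robots parser; the user agent and paths below are illustrative:

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt policy directly, with no network fetch
policy = """
User-agent: *
Disallow: /admin/
Disallow: /preview/
Disallow: /staging/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(policy.splitlines())

# Private surfaces stay blocked for any crawler...
print(parser.can_fetch("ChatGPT-User", "/admin/settings"))    # False
# ...while public pages remain fetchable.
print(parser.can_fetch("ChatGPT-User", "/blog/robots-guide")) # True
```

Running this check in CI against the deployed robots.txt catches the quiet regressions the closing warning describes.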

The wrong robots file can quietly erase distribution. This is one of the few settings where a small mistake can undo a lot of editorial work.