llms.txt Adoption Audit 2026: Top 1000 Sites Crawl Report
What llms.txt Is and Why It Matters Now
llms.txt is a root-level file proposed in late 2024 and rapidly adopted in 2025-2026. Its purpose is to give LLM crawlers and AI user-agent tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) a concise content map of the site. robots.txt controls access; llms.txt says what to read, in what order, and with what summary.
Crawl Method
- Sample: Tranco top 1000 (May 10, 2026 snapshot)
- Checks: `/llms.txt` + `/llms-full.txt` + HTTP 200 validation
- Content audit: heading presence, link count, "> " summary block, robots.txt cross-check
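The existence check described above could be sketched roughly as follows. This is a minimal illustration and not the crawler used for this report: the function names (`candidate_urls`, `looks_like_llms_txt`, `probe`), the User-Agent string, the timeout, and the HTML-sniffing heuristic are all assumptions.

```python
from urllib.request import Request, urlopen

# The two root-level paths checked per domain.
PATHS = ("/llms.txt", "/llms-full.txt")


def candidate_urls(domain: str) -> list[str]:
    """Build the two root-level URLs to probe for a bare domain name."""
    host = domain.strip().rstrip("/")
    return [f"https://{host}{path}" for path in PATHS]


def looks_like_llms_txt(status: int, body: str) -> bool:
    """Accept only HTTP 200 responses whose body is not an HTML page.

    Assumption: some sites answer 200 with an HTML "not found" page,
    so a bare status check would overcount.
    """
    if status != 200:
        return False
    head = body.lstrip()[:200].lower()
    return bool(head) and not head.startswith(("<!doctype", "<html"))


def probe(url: str, timeout: float = 10.0) -> bool:
    """Fetch one URL and apply the validation above (needs network access)."""
    request = Request(url, headers={"User-Agent": "llms-txt-audit-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            body = response.read(65536).decode("utf-8", errors="replace")
            return looks_like_llms_txt(response.status, body)
    except OSError:
        # HTTPError, URLError and timeouts are all OSError subclasses.
        return False
```

Keeping the validation in a pure function separate from the network call makes the acceptance rule easy to test without fetching anything.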
Adoption by Category
| Category | has llms.txt | has llms-full.txt |
|---|---|---|
| SaaS / Developer Tools | 23.8% | 14.2% |
| News / Publishing | 4.1% | 0.9% |
| E-commerce | 2.7% | 0.4% |
| Education (.edu) | 9.3% | 3.1% |
| Government (.gov) | 0.6% | 0% |
| Overall average | 7.4% | 3.2% |
Format Compliance
The spec requires Markdown headings. Of the 74 files we audited, 68 are spec-compliant; 6 are plain text (LLMs deprioritize these).
# Site Name
> One-paragraph summary — core purpose and audience.
## Docs
- [Quickstart](https://...): Brief description
- [API Reference](https://...): Description
## Optional
- [Blog](https://...): Description
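A content audit against this template might look like the sketch below. It assumes a simple line-based reading of the file; the function name `audit_llms_txt`, the returned keys, and the link regex are illustrative choices, not the exact rules applied in this report.

```python
import re

# Matches list items of the form "- [Title](https://...)".
LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)", re.MULTILINE)


def audit_llms_txt(text: str) -> dict:
    """Report heading presence, summary block, sections and link count."""
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line.strip()]
    return {
        # The first non-empty line should be the "# Site Name" heading.
        "has_h1": bool(non_empty) and non_empty[0].startswith("# "),
        # The site summary lives in a "> " blockquote line.
        "has_summary": any(line.startswith("> ") for line in lines),
        # Section names come from "## " headings.
        "sections": [line[3:].strip() for line in lines if line.startswith("## ")],
        "link_count": len(LINK_RE.findall(text)),
    }


sample = """# Site Name
> One-paragraph summary of purpose and audience.
## Docs
- [Quickstart](https://example.com/quickstart): Brief description
- [API Reference](https://example.com/api): Description
## Optional
- [Blog](https://example.com/blog): Description
"""
report = audit_llms_txt(sample)
```

A plain-text file with no headings would come back with `has_h1` set to `False` and a `link_count` of 0, which is how the non-compliant files could be separated from the rest.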
robots.txt Conflict Scenarios
- Scenario A (robots.txt blocks GPTBot, llms.txt exists): A compliant crawler cannot follow the listed links, and under a blanket Disallow it should not fetch llms.txt either. The two files contradict each other; there is no SEO impact, but it creates operational confusion.
- Scenario B (robots.txt permits, llms.txt absent): Default behavior. AI crawlers walk the entire site; without your summary, the crawler, not you, picks the "canonical source" pages.
- Scenario C (both aligned): Ideal. Every URL listed in llms.txt should be allowed in robots.txt; the rest can be disallowed.
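The cross-check between the two files can be sketched with Python's standard `urllib.robotparser`. The helper name `blocked_links` and the sample robots.txt are assumptions for illustration. Note that the standard-library parser applies rules in file order (first match wins), which can differ from how a given crawler resolves Allow/Disallow conflicts.

```python
from urllib.robotparser import RobotFileParser


def blocked_links(robots_txt: str, urls: list[str], user_agent: str) -> list[str]:
    """Return the llms.txt URLs that robots.txt forbids for one user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt: GPTBot is blocked entirely, everyone else
# may read /docs/ only.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /docs/
Disallow: /
"""

llms_urls = [
    "https://example.com/docs/quickstart",
    "https://example.com/blog",
]

gptbot_blocked = blocked_links(ROBOTS, llms_urls, "GPTBot")
claudebot_blocked = blocked_links(ROBOTS, llms_urls, "ClaudeBot")
```

An empty result for every crawler you care about corresponds to Scenario C; any non-empty result points to a Scenario A style conflict for that user agent.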
Implementation Steps
- Pick your 8-12 most important pages (services, docs, about, case studies).
- Write a one-sentence description per link. Long descriptions add noise to LLM summarization.
- Apply the Markdown template; place the site summary in the first `>` block.
- Treat llms.txt like sitemap.xml: keep it updated as content evolves.
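The steps above can be sketched as a small generator that renders the template from a page list. The names `render_llms_txt` and `link_count_in_range`, the input shape, and the example URLs are hypothetical; the 8-12 range simply mirrors the first step.

```python
def render_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt document.

    `sections` maps a section heading to a list of
    (title, url, description) tuples.
    """
    # Collapse whitespace so the summary stays inside one "> " block.
    lines = [f"# {site_name}", "", f"> {' '.join(summary.split())}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {' '.join(description.split())}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def link_count_in_range(sections: dict, low: int = 8, high: int = 12) -> bool:
    """Check the total link count against the suggested 8-12 pages."""
    total = sum(len(links) for links in sections.values())
    return low <= total <= high


pages = {
    "Docs": [
        ("Quickstart", "https://example.com/quickstart", "Install and run."),
    ],
}
document = render_llms_txt("Example", "Docs  for the\nExample API.", pages)
```

Generating the file from the same source that feeds sitemap.xml is one way to keep the two in step as content changes.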
Editorial Note
At 7.4% adoption, the window is still open. The advantage will go to sites that implement the file correctly, not to those that arrive late. Outside SaaS, serious investment is rare. A spec-compliant, concise, well-linked llms.txt takes about 30 minutes to write, which arguably makes it the single highest-ROI file for LLM visibility over the next 12 months.