We Scanned 5,000 Websites for AI Readiness. The Results Are Alarming.
73% of websites fall short of the minimum threshold for AI discovery. We scanned 5,000 sites across 14 industries, and the data reveals a massive readiness gap that most businesses don't even know exists.
Founder & CEO at AgentReady
Why We Ran This Study
I spent 15 years in SEO watching the web evolve through algorithm updates, mobile-first indexing, and Core Web Vitals. Each shift rewarded the sites that prepared early and punished those that waited. When AI-powered search engines and autonomous agents started reshaping discovery in late 2025, I had a gut feeling the web wasn't ready. So we built AgentReady™ to measure it.
Between January 6 and February 28, 2026, we scanned 5,000 websites across 14 industries using the AgentReady scoring framework. We evaluated each site against 47 discrete signals grouped into six weighted categories: Bot Access & Crawlability, Structured Data & Schema, AI Protocols, Content Quality & Structure, Authority & Trust, and Technical Performance.
The goal was straightforward: establish the first industry-wide baseline for AI readiness. Not opinions. Not predictions. Data. Every number you see in this report comes directly from our scanning infrastructure, and we plan to re-run this study quarterly so we can track how the web evolves.
Methodology: How We Score AI Readiness
Each website receives a score from 0 to 100 based on six weighted categories. The weights reflect how much each category actually influences whether an AI system can discover, understand, and accurately represent your content. You can read the full breakdown in our scoring methodology docs.
We crawled each site's homepage, up to 50 interior pages, robots.txt, sitemap.xml, and checked for the presence of llms.txt, NLWeb endpoints, and MCP server declarations. Schema markup was validated against Google's structured data guidelines with additional checks for AI-relevant types. Content quality was assessed via heading structure, paragraph density, internal linking depth, and the presence of original research or cited sources.
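The root-file presence checks above can be sketched in a few lines. This is our own illustrative sketch, not the published scanning code; the function name and the assumption that these files live at the site root are ours.

```python
from urllib.parse import urljoin

# Well-known files the scanner probes at the site root. These are the
# conventional locations; llms.txt in particular is a proposal, not a
# formal standard.
PROTOCOL_FILES = ["robots.txt", "sitemap.xml", "llms.txt"]

def protocol_check_urls(base_url: str) -> list[str]:
    """Build the list of root-level URLs to probe for AI-protocol signals."""
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, path) for path in PROTOCOL_FILES]
```

In practice each URL would then be fetched and a 200 response recorded as a positive signal for that category.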
Bot Access & Crawlability carries the highest weight at 25% because if AI agents can't reach your content, nothing else matters. Structured Data & Schema follows at 20%, then AI Protocols at 20%, Content Quality at 20%, Authority & Trust at 10%, and Technical Performance at 5%.
AgentReady Scoring Weight Distribution
Headline Findings: The AI Readiness Gap Is Real
The average AI readiness score across all 5,000 sites is 57 out of 100. That places the typical website in our C grade band, meaning it has fundamental gaps that limit AI visibility. But the average masks a more troubling distribution.
73% of websites scored below 65, the threshold we consider minimally ready for AI discovery. Only 4.2% scored above 85 (A grade). The median score was 54, pulled down by massive underperformance in AI Protocols and Structured Data.
Three findings stood out above all others. First, only 12% of sites have an llms.txt file, the most basic AI protocol signal. Second, 38% of sites actively block at least one major AI crawler in their robots.txt, often without realizing it. Third, e-commerce sites, the category with the most to gain from AI product recommendations, scored the lowest average at 42 out of 100.
Industry Breakdown: Who Leads and Who Lags
The gap between the highest- and lowest-scoring industries is 25 points, which is enormous when you consider that a 20-point difference can separate being cited by AI from being completely invisible.
Tech/SaaS leads at 67, driven by developer-oriented teams that adopted structured data and AI protocols early. Media & Publishing follows at 64, largely because content-heavy sites tend to have strong heading hierarchies and author attribution. Education scores 62, helped by .edu domains' inherent authority signals.
At the bottom, e-commerce averages 42, dragged down by thin product descriptions, missing schema, and over-reliance on JavaScript rendering. Healthcare sits at 47, where regulatory caution leads to aggressive bot blocking. Real estate scores 48, plagued by IDX iframe content that AI crawlers can't parse.
The complete guide to AI-ready websites covers specific strategies for each industry tier.
Average AI Readiness Score by Industry
Grade Distribution: A Sea of C's and D's
We assign letter grades on the following scale: A (85-100), B (70-84), C (55-69), D (40-54), F (0-39). The distribution tells a clear story.
Only 4.2% of sites earn an A grade. These are almost exclusively tech companies, major publishers, and enterprise SaaS platforms with dedicated technical SEO teams. 14.8% earn a B, mostly sites that had strong traditional SEO foundations and happened to benefit from existing schema markup.
The bulk of the web sits in C and D territory. 38% score C and 31% score D. These sites are partially visible to AI but have critical gaps, typically in AI protocols and structured data. The remaining 12% score F, and these sites are essentially invisible to AI agents. They block crawlers, lack any schema markup, have no author attribution, and often serve content primarily through JavaScript rendering.
- A Grade (85-100): 4.2% of sites -- fully AI-optimized, protocol-ready
- B Grade (70-84): 14.8% of sites -- strong foundation, minor protocol gaps
- C Grade (55-69): 38.0% of sites -- partially visible, key categories missing
- D Grade (40-54): 31.0% of sites -- significant gaps in most categories
- F Grade (0-39): 12.0% of sites -- effectively invisible to AI agents
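The grade bands map mechanically onto the score. A minimal sketch:

```python
def grade(score: float) -> str:
    """Map a 0-100 AI readiness score to the report's letter-grade bands."""
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 55:
        return "C"
    if score >= 40:
        return "D"
    return "F"
```

By this mapping, the study's average of 57 lands in the C band and the e-commerce average of 42 in the D band.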
The 5 Most Common AI Readiness Failures
Across all 5,000 sites, five issues appeared with alarming frequency. Fixing just these five would move the average score from 57 to an estimated 72.
1. Missing or incomplete schema markup (78% of sites). Most sites either have no structured data at all or implement only basic Organization schema. Product, Article, FAQ, and HowTo schemas are absent on the vast majority of pages that would benefit from them. Our schema markup guide covers the types that matter most.
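For illustration, a minimal Article schema in JSON-LD looks like the following; every value is a placeholder, not a recommendation from the study:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example headline",
  "author": {
    "@type": "Person",
    "name": "Jane Doe"
  },
  "datePublished": "2026-03-01",
  "publisher": {
    "@type": "Organization",
    "name": "Example Co"
  }
}
```

A snippet like this goes in a `<script type="application/ld+json">` tag in the page head or body.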
2. No llms.txt file (88% of sites). This simple text file tells AI systems what your site is about and which pages matter most. It takes 10 minutes to create and has an outsized impact on AI discoverability. Learn how in our llms.txt tutorial.
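A minimal llms.txt, following the markdown conventions of the llms.txt proposal (the site name, summary, and URLs below are placeholders):

```
# Example Co

> Example Co builds AI readiness tooling for websites.

## Key pages

- [Scoring methodology](https://example.com/docs/scoring): How scores are computed
- [Pricing](https://example.com/pricing): Plans and tiers
```

The file lives at the site root (`/llms.txt`) as plain markdown.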
3. Blocking AI crawlers in robots.txt (38% of sites). Many sites copied robots.txt configurations that block GPTBot, ClaudeBot, or other AI crawlers, often inherited from templates or security plugins.
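One way to undo accidental blocking is to allow the AI user agents explicitly before the catch-all rules. The user-agent tokens below are the crawlers' published names; the Disallow path is a placeholder:

```
# Allow the major AI crawlers explicitly
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Everyone else follows the normal rules
User-agent: *
Disallow: /admin/
```

Per the robots.txt standard, crawlers obey the most specific User-agent group that matches them, so the named groups above override the wildcard.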
4. No author attribution (64% of sites). Pages without clear authorship lose significant trust signals. AI systems increasingly weight E-E-A-T factors when deciding which sources to cite.
5. Thin or poorly structured content (52% of sites). Pages with fewer than 300 words, missing H2/H3 hierarchies, or walls of unbroken text score poorly on Content Quality.
Frequently Asked Questions
How were the 5,000 websites selected?
We selected sites across 14 industries using a stratified random sample from the Tranco top 100K list, supplemented with mid-market sites from industry directories. Each industry includes between 280 and 450 sites to ensure statistical significance.
What is a good AI readiness score?
A score of 70+ (B grade) means your site has a solid foundation for AI visibility. A score of 85+ (A grade) means you are fully optimized. The current average is 57, so any score above 65 puts you ahead of most of the web.
How often will this study be updated?
We plan to re-run the full scan quarterly. The next update is scheduled for June 2026. Subscribers to our newsletter will receive early access to each new report.
Check Your AI Readiness Score
Free scan. No signup required. See how AI engines like ChatGPT, Perplexity, and Google AI view your website.
Scan Your Site Free

SEO veteran with 15+ years leading digital performance at 888 Holdings, Catena Media, Betsson Group, and Evolution. Now building the AI readiness standard for the web.
Related Articles
The AI Readiness Report: E-Commerce Edition
E-commerce sites score the lowest of any industry at 42/100. We break down why, which CMS platforms perform best, and the 3 fixes that can move a store from 42 to 65.
Which CMS Is Most AI-Ready? We Analyzed the Data.
We analyzed AI readiness scores across WordPress, Shopify, Wix, Squarespace, Next.js, Webflow, and custom builds. The gap between best and worst is 36 points.
AI Protocol Adoption: Where the Web Stands in March 2026
We measured adoption rates for llms.txt, NLWeb, and MCP across 5,000 websites. The numbers are tiny but growing fast, with llms.txt doubling since December 2025.