984 Sites, 12 Industries: What Actually Predicts AI Citations?
We built a scoring framework, ran it on 984 websites, then checked which ones actually get cited by AI. The correlation is real — but only in specific industries. Here's what we found.
Founder & CEO at AgentReady
The Honest Question We Had to Answer
When we built the AgentReady scoring framework, we made a bet: that the factors we measure — schema markup, bot access, content quality, AI protocols, authority signals, crawl efficiency — actually correlate with whether AI systems cite a website. That's the whole thesis. And we needed to test it.
So we ran a proper study. 984 websites across 12 industries. We scored each one, then used a panel of 87 AI-generated queries to check which sites actually appeared in AI responses (ChatGPT, Perplexity, and Claude). We calculated Spearman rank correlation coefficients between readiness scores and citation rates for each industry.
The results were more nuanced than we expected — and we're going to share them with complete transparency, including the parts that complicated our thesis.
The Overall Finding: ρ=0.025 (Weak, But Not the Full Story)
Across all 984 sites and all 12 industries combined, the Spearman correlation between AI readiness score and citation rate is ρ=0.025. That's statistically negligible.
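For readers who want to sanity-check numbers like this on their own data: Spearman's ρ is simply Pearson correlation computed on rank-transformed values. Here's a minimal, dependency-free sketch — the scores and rates below are illustrative placeholders, not figures from our dataset:

```python
def average_ranks(values):
    """Rank transform with average ranks for ties (1-based)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Illustrative only: readiness scores vs. citation rates for five sites.
scores = [62, 87, 45, 91, 73]
rates = [12.0, 9.0, 3.0, 40.0, 11.0]
rho = spearman_rho(scores, rates)
```

Because ρ operates on ranks rather than raw values, it captures monotonic relationships ("higher score, more citations") without assuming linearity — which is why it's the standard choice for score-vs-rate comparisons like this one.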
If you stopped there, you'd conclude our scoring framework predicts nothing. But stopping there would be wrong — because the aggregate masks dramatic industry-level variation. When we broke out the data by sector, the picture changed completely.
The key insight from our v3 study is this: current AI citations are driven primarily by brand recognition (how well-known a site is from training data), not by technical readiness. The more established a brand, the more AI systems cite it — regardless of schema markup, llms.txt, or crawl optimization. This is the Brand Fame Effect, and it explains why Wikipedia, WebMD, and Investopedia get cited constantly despite not being technically 'AI-ready' by our criteria.
But brand recognition isn't the whole story either.
Where Correlation Is Real: The YMYL Industries
When we isolated YMYL (Your Money or Your Life) industries — sectors where AI systems apply extra source quality scrutiny — the correlations jumped dramatically.
Healthcare: ρ=0.72 — By far the strongest. Medical sites with strong E-E-A-T signals (author credentials, clinical citations, schema markup) are cited significantly more often than technically weaker peers. AI systems are most cautious about medical misinformation, so they reward trust signals.
Government: ρ=0.40 — Second highest. Government sites with clear entity signals, structured service data, and accessible content are cited more reliably. Regulatory authority matters.
Education: ρ=0.35 — Strong in the .edu domain, where institutional credibility intersects with technical readiness.
Insurance: ρ=0.33 — Similar pattern to healthcare, with policy structure and trust signals driving citation rates.
Finance: ρ=0.20 — Meaningful but more modest, reflecting the complexity of financial information and AI systems' caution about specific recommendations.
For all other industries (tech, e-commerce, media, travel, etc.), correlations were below ρ=0.15 — weak enough that brand recognition likely explains most of the remaining variation.
[Chart: Spearman Correlation — AI Readiness Score vs. Citation Rate, by industry]
Why YMYL Industries Are the Exception
The strong correlations in healthcare, government, and education aren't accidental. AI systems apply a quality filter called E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) when generating responses in high-stakes domains. That filter is more legible for AI systems when sites have structured signals — author credentials, institutional affiliations, citations, schema markup.
In other words: in YMYL industries, technical readiness acts as a proxy for credibility. Schema markup isn't just a crawlability signal — it's a trust signal. E-E-A-T signals aren't just SEO factors — they're the machine-readable version of authority that AI citation systems rely on.
For brands in healthcare, government, education, insurance, and finance: technical AI readiness is already delivering measurable citation advantages. The correlation data makes this actionable, not hypothetical.
What This Means for Your AI Strategy
The study doesn't invalidate technical AI readiness — it contextualizes it. Here's what we conclude:
If you're in a YMYL industry: AI readiness improvements have a direct, measurable impact on citation rates today. Healthcare, government, education, insurance, and finance organizations should treat AI readiness as a top-priority initiative.
If you're in a non-YMYL industry: Technical readiness is about positioning for the future, not capturing citations today. As AI systems shift to real-time crawl-based citation, brand recognition will matter less and technical signals will matter more. Sites investing now will have a structural advantage when that shift accelerates.
For everyone: The premium we observed for sites scoring 95+ suggests that once brand recognition and technical optimization combine, citation rates jump meaningfully. Smaller brands that achieve technical excellence may punch above their weight — especially in industries where dominant brands don't maintain clean technical signals.
You can explore the full dataset and download the raw correlation data on our /research page.
Frequently Asked Questions
How was the citation check conducted?
We used a panel of 87 industry-specific queries across ChatGPT, Perplexity, and Claude. For each site in the study, we checked whether the domain or brand was mentioned in AI responses to relevant queries. Citation rate is defined as the percentage of applicable queries where the site was mentioned.
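That definition can be made concrete with a short sketch. The query strings and engine names below are hypothetical, and we're assuming here that a mention by any one of the three engines counts as a citation for that query (the study text doesn't pin down that detail):

```python
def citation_rate(query_results):
    """query_results: {query: {engine: True if the site was mentioned}}.

    Citation rate = percentage of applicable queries where the site
    was mentioned by at least one engine (an assumption for this sketch).
    """
    applicable = list(query_results)
    hits = sum(1 for q in applicable if any(query_results[q].values()))
    return 100.0 * hits / len(applicable)

# Hypothetical results for one healthcare site across three queries.
example = {
    "best diabetes management tips": {"chatgpt": True, "perplexity": False, "claude": True},
    "how does insulin resistance work": {"chatgpt": False, "perplexity": False, "claude": False},
    "safe blood sugar ranges": {"chatgpt": False, "perplexity": True, "claude": False},
}
rate = citation_rate(example)  # mentioned in 2 of 3 applicable queries
```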
Why is the overall correlation so weak?
Current AI citations are heavily influenced by brand recognition from training data — how well-known a site is. Well-known brands get cited regardless of technical readiness. The correlation becomes meaningful only in YMYL industries, where AI systems apply stricter source quality filters that technical signals help satisfy.
Does this mean AI readiness scoring is useless for non-YMYL sites?
No. As AI systems shift from training-data citations to real-time crawl-based citation (already underway with Perplexity and Google AI Overviews), technical readiness will increasingly drive visibility for all industries. Sites investing in technical AI readiness now are positioning for that shift — not optimizing for the current state.
Check Your AI Readiness Score
Free scan. No signup required. See how AI engines like ChatGPT, Perplexity, and Google AI view your website.
Scan Your Site Free
SEO veteran with 15+ years leading digital performance at 888 Holdings, Catena Media, Betsson Group, and Evolution. Now building the AI readiness standard for the web.
Related Articles
We Scanned 5,000 Websites for AI Readiness. The Results Are Alarming.
73% of websites are invisible to AI. We scanned 5,000 sites across 14 industries and the data reveals a massive readiness gap that most businesses don't even know exists.
Opinion
The Brand Fame Paradox: Why Famous Sites Get AI Citations Without Being Ready
We ran the study. The honest answer: overall, AI readiness scores have a weak correlation with AI citations (ρ=0.025). Famous brands dominate. But the story isn't that simple — and there's a clear path for everyone else.
Guides
The Complete Guide to Making Your Website AI-Ready in 2026
Everything you need to know about making your website visible to AI systems in 2026 — the 8 factors that determine whether AI agents cite your content or skip it entirely.