AI Search Is Still SEO (Kevin Indig and AirOps Just Proved It)

The AI search panic narrative has been everywhere for the past year. Everything is different now. Traditional SEO is dead. You need an entirely new playbook. The fundamentals don’t apply anymore.

A new study from AirOps and Kevin Indig should put a lot of that to rest.

The Fan-Out Effect analyzed 16,851 queries and 353,799 pages across ChatGPT’s full retrieval pipeline. The findings are clear and the implications are direct. AI search is still SEO. The principles haven’t changed. A few specific tactics need adjusting, but anyone who told you to throw out your SEO playbook was wrong.

This note covers the findings that matter most and validates a few things I’ve shared here over the past few months.

Retrieval Rank Is the Whole Game

The single most important finding from the study: a page at position 1 in ChatGPT’s retrieval results has a 58% citation rate. By position 10, that drops to 14%. That’s a 4x gap that no amount of content quality can close.

ChatGPT doesn’t pull from some magical alternative source. It runs web searches, gets back ranked results, and cites from there. The retrieval system underneath is doing the heavy lifting. If you don’t rank well in traditional search, you don’t get cited in AI search.

The study tested this against every other variable and the conclusion held. A page with perfect content relevance at rank 11 or worse got cited 21.5% of the time. A page with mediocre content relevance at rank 1 got cited 55.9% of the time. Rank overrides content quality.

That’s the headline argument. The “AI search makes traditional SEO obsolete” narrative collapses under this finding. ChatGPT citations flow through the same retrieval mechanics that have always determined organic search visibility. Great SEO isn’t your obstacle in AI search. It’s your advantage.

Heading Match Is the Primary On-Page Lever

Last week’s note covered semantic distance and the Google patent that describes how heading structure creates semantic relationships on a page. That note explained the mechanics. This study quantifies the impact.

Pages whose headings closely match the query are cited 41% of the time. Pages with weak heading matches get cited 29% of the time. That 12-point gap holds even after controlling for retrieval rank.

The study compared heading match against every other content signal: word count, topical breadth, body copy depth, schema markup, readability. Heading structure was the strongest content predictor of citation. By a meaningful margin.

This connects directly to what last week’s note covered. Headings aren’t just keyword placement opportunities. They’re semantic containers that define what a page is about. When the container clearly maps to the query someone is asking, AI systems and traditional search engines both reward it. When the container is vague or off-topic, both penalize it.
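The study measures heading match with its own embedding-based methodology, but the intuition is easy to see with a much cruder stand-in. The sketch below uses simple token-overlap (Jaccard) similarity, which is an illustrative assumption, not the study’s actual scoring method:

```python
def heading_match_score(query: str, heading: str) -> float:
    """Rough token-overlap similarity between a query and a heading.

    A simplified stand-in for the embedding-based similarity real
    retrieval systems use; it only illustrates the idea that a heading
    which mirrors the query scores higher than a vague one.
    """
    q = set(query.lower().split())
    h = set(heading.lower().split())
    if not q or not h:
        return 0.0
    return len(q & h) / len(q | h)  # Jaccard similarity

query = "how to fix crawl errors in search console"
# Heading that maps directly to the query: perfect overlap.
print(heading_match_score(query, "How to Fix Crawl Errors in Search Console"))  # 1.0
# Vague container: no overlap with the query at all.
print(heading_match_score(query, "Technical Tips and Tricks"))  # 0.0
```

Real systems compare meaning, not tokens, so synonyms and paraphrases also count; the point is simply that the heading is the container being matched.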

Heading Structure Has a Sweet Spot

The study also found a sweet spot for how many subheadings to use, and a counterintuitive pattern below it.

For articles, the optimal range is 4 to 10 H2-H4 subheadings (33.2% citation rate). The strange finding: articles with 1 to 3 subheadings (28%) perform worse than articles with zero subheadings (30.1%). Half-measures are worse than no structure at all. Either commit to proper structure or don’t bother.

The sweet spot also varies by page type. Articles do best with 4 to 10 subheadings. Product pages, oddly, perform best with zero subheadings (43.2%) and worst with 21 or more (25%). The “other” bucket (forums, landing pages) tracks the article pattern.

The takeaway: don’t apply article-page heading structure to product pages. Product pages are typically focused on a single item and don’t need editorial scaffolding. Different page types have different optimal structures, and forcing the wrong structure on a page hurts more than it helps.
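If you want to audit your own pages against these ranges, counting H2-H4 tags takes a few lines of stdlib Python. The sweet-spot check below encodes the study’s article range (4 to 10); the sample HTML is placeholder content:

```python
from html.parser import HTMLParser

class SubheadingCounter(HTMLParser):
    """Count H2-H4 tags, the range the study uses for its sweet spot."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3", "h4"):
            self.count += 1

def count_subheadings(html: str) -> int:
    parser = SubheadingCounter()
    parser.feed(html)
    return parser.count

page = "<h1>Guide</h1><h2>Basics</h2><h3>Setup</h3><h2>Advanced</h2><p>...</p>"
n = count_subheadings(page)
print(n)  # 3

# Article sweet spot per the study: 4-10 subheadings.
# Note that 1-3 is the worst bucket, so this page is in the danger zone.
in_sweet_spot = 4 <= n <= 10
```

Remember the page-type caveat: this threshold is for articles, not product pages.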

Domain Authority Doesn’t Translate

A few weeks ago I wrote about how Domain Authority and similar metrics get misused. This study delivers one of the most direct empirical contradictions of DA-based thinking I’ve seen.

Always-cited pages have lower DA (53) than never-cited pages (56). Backlinks show a 3x inverse gap. The always-cited pages have an average of 1.1 million backlinks, while the never-cited pages have 3.2 million.

Pages that get cited consistently have fewer links and lower DA than pages that never get cited.

The site-type breakdown is even more damning. Five of the highest-DA site types in the study produce wildly different citation rates: YouTube (DA 100) at 2.4%, Reddit (DA 92) at 29.9%, Major News (DA 94) at 32%, Health Publishers (DA 90) at 46.4%, Wikipedia (DA 95) at 59.2%. Almost identical authority. Citation rates spanning 25x.

DA tells you nothing about citation likelihood. Just like it tells you nothing about how Google evaluates content.

Length Isn’t the Answer

In the recent note on SEO concepts that aren’t helping you, I covered why word count chasing doesn’t work. The study confirms it.

The citation sweet spot is 500 to 2,000 words. Pages over 5,000 words underperform pages under 500 words. Long-form padding actively hurts you in AI search.

The reason is the same one that applies in traditional search. Word count itself does nothing. What helps is covering the topic with depth and specificity. What hurts is padding to hit a target. AI systems appear to be even less tolerant of filler than traditional search results, probably because they’re trying to extract specific, citable information rather than rank pages.

If your content strategy revolves around hitting word count targets, that strategy is working against you in both traditional and AI search.

Focused Beats Comprehensive

This finding partially complicates the standard SEO playbook. The “ultimate guide” approach to content has been a dominant strategy for years. The study suggests it actively hurts AI citation rates.

Pages covering 26 to 50% of ChatGPT’s fan-out subtopics outperform pages covering 100% of them. When primary query relevance is held constant, exhaustive coverage actually reduces citation rate.

The study’s interpretation: exhaustive coverage signals “generalist” content that addresses many topics without depth. Moderate coverage paired with strong primary relevance signals focused expertise.

This loosely connects to information gain. The point isn’t to cover everything that has ever been written about a topic. The point is to cover the right things with depth. A page that nails one question outperforms a page that adequately addresses five. Fan-out subtopics aren’t a content checklist. They’re context.
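To make the coverage metric concrete, here is a deliberately crude sketch of measuring what fraction of fan-out subtopics a page touches. The study derives subtopics from ChatGPT’s actual query fan-out; the hand-written list and substring matching below are illustrative assumptions:

```python
def subtopic_coverage(page_text: str, subtopics: list[str]) -> float:
    """Fraction of fan-out subtopics a page mentions (crude substring check).

    Illustrative only: real fan-out subtopics come from the AI system's
    own query expansion, approximated here with a hand-written list.
    """
    text = page_text.lower()
    hits = sum(1 for s in subtopics if s.lower() in text)
    return hits / len(subtopics) if subtopics else 0.0

subtopics = ["pricing", "installation", "alternatives", "security", "api limits"]
page = "Our installation guide covers setup, pricing tiers, and security basics."
print(subtopic_coverage(page, subtopics))  # 0.6
```

Per the study, you’d want a score in the 0.26 to 0.50 range paired with strong primary relevance, not a forced 1.0.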

(Side note: read the recent note on information gain here. Also, I just published a new video expanding on that note that is worth checking out. You can watch that below or over on YouTube.)

If you’ve been building 5,000-word ultimate guides on the assumption that more comprehensive equals more rankable, this study says you should reconsider. Focused, deep coverage of the primary query is what gets cited.

Schema Markup Is a Real Signal

Pages with JSON-LD schema markup have a 6.5 percentage point citation advantage (38.5% vs 32%). The study verified this isn’t explained by other factors. Schema and non-schema pages have similar word counts, heading counts, DA, and query match scores. The schema markup itself is contributing the lift.

The top-performing schema types:

  • MedicalWebPage: 47% citation rate
  • BreadcrumbList: 46.2%
  • FAQPage: 45.6%
  • Organization: 44.3%
  • WebSite: 40.6%

Schema markup helps AI systems parse and categorize page content. If you’ve been treating schema as optional, this is a reason to reconsider. It’s one of the few signals in the study that delivers a clear advantage independent of everything else.
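For reference, here is what a minimal JSON-LD block looks like for FAQPage, one of the top performers above. This sketch builds the schema.org structure in Python and emits the script tag you’d place in the page head; the question and answer text are placeholder content, not a template from the study:

```python
import json

# Minimal FAQPage JSON-LD using the schema.org vocabulary.
# The question/answer text here is placeholder content.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Does schema markup affect AI citations?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Pages with JSON-LD showed a 6.5-point citation advantage in the study.",
            },
        }
    ],
}

# Emit as a <script type="application/ld+json"> block for the page <head>.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(faq_schema, indent=2)
    + "\n</script>"
)
print(snippet)
```

Most CMS plugins generate this for you; the value is in making sure the generated types actually describe the page.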

Write at a Higher Reading Level Than You Think

This one is genuinely counterintuitive. The “write for an 8th grader” advice has been floating around SEO content guidance for years. The study contradicts it directly.

Flesch-Kincaid grade 16-17 (college level) writing performs best at 35.9% citation rate. Kindergarten-level writing performs worst at 29.6%. The signal peaks at college-level vocabulary and sentence structure, then tapers slightly above grade 18.

The study’s interpretation is that expert-written content tends to use higher-grade vocabulary and more complex sentence structure, and AI systems appear to favor that signal as a marker of expertise.

The practical takeaway: don’t dumb your content down past the level of expertise your audience expects. If you’re writing for practitioners, write at a practitioner level. If you’re writing for technical audiences, use the technical language they actually use. Oversimplifying for an imagined “8th grade reader” who doesn’t exist in your actual audience may be costing you visibility in AI search.
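If you want to check where your own copy lands, the Flesch-Kincaid grade formula is simple: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below uses a crude vowel-group heuristic for syllables, so treat its numbers as approximate; real readability tools use dictionary-based counts:

```python
import re

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level with a crude vowel-group syllable count.

    The heuristic (count runs of vowels per word, minimum 1) is only
    meant to show how the grade is computed, not to match commercial
    readability tools exactly.
    """
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n = max(1, len(words))
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 0.39 * (n / sentences) + 11.8 * (syllables / n) - 15.59

simple = "The cat sat. The dog ran. We went home."
dense = ("Retrieval-augmented generation systems preferentially cite "
         "semantically aligned, authoritative documentation.")
print(flesch_kincaid_grade(simple) < flesch_kincaid_grade(dense))  # True
```

Per the study, content scoring around grade 16-17 performed best, so a practitioner-level draft that a readability plugin flags as "too hard" may be exactly right.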

The Takeaway

AI search is still SEO. The principles haven’t changed.

Rank well in retrieval, because nothing else matters if you can’t be found. Use headings that match the query, with proper structure for your page type. Write focused content of appropriate length. Use schema markup. Write at the reading level your audience actually expects. And don’t chase Domain Authority, because neither traditional search nor AI retrieval is using it.

The “AI changes everything” narrative was wrong. The “you need a completely new playbook” narrative was wrong. A few tactics need adjusting (length targets are tighter, exhaustive coverage hurts more than it helps, expert-level writing matters more than it did), but the fundamentals still work.

The fundamentals are still the work.

Read the full AirOps and Kevin Indig study.

Sign up for weekly notes straight from my vault.

Tools I Use:

🔎 Semrush – Competitor and Keyword Analysis

✅ Monday.com – For task management and organizing all of my client work

📄 Frase – Content optimization and article briefs

📈 Keyword.com – Easy, accurate rank tracking

🗓️ Akiflow – Manage your calendar and daily tasks

📊 Conductor Website Monitoring – Site crawler, monitoring, and audit tool

👉 SEOPress – It’s like Yoast, if Yoast wasn’t such a mess.

