
AI Overview Optimization Has Already Been Cracked

By Shabir Ahmad

Posted on June 17, 2025

Just three months after Google’s AI Overviews were globally deployed, Sellm’s team has reverse-engineered how the models work by developing their own RAG framework. Below, we summarize Sellm’s findings as a third party—quoting directly from their original post—to reveal exactly how these insights change the SEO game.

Understanding Google’s AI Overviews

Sellm begins by defining Google’s “AI Overviews” (also referred to as “generative summaries” or “AI-generated summaries”) as “blocks of text that appear at the very top of search engine results pages (SERPs).” Rather than presenting a list of traditional snippets, Google “synthesizes information from the top-ranking pages to provide users with an immediate, comprehensive answer.” In practice, Sellm explains, Google’s system will:

Retrieve the top N relevant results,
Feed that content into a large language model (LLM, such as Gemini or ChatGPT) trained on ranking and summarization, and
Output a concise summary that combines the most relevant pieces of information.
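The three steps above can be sketched in a few lines of Python. This is only an illustration of the retrieve-then-summarize flow as Sellm describes it, not Google's implementation; `build_overview`, `toy_search`, `toy_summarize`, and the sample pages are all hypothetical stand-ins.

```python
from typing import Callable

def build_overview(query: str,
                   search: Callable[[str, int], list[str]],
                   summarize: Callable[[str, list[str]], str],
                   top_n: int = 5) -> str:
    """Retrieve the top-N results, then hand them to a summarizer."""
    passages = search(query, top_n)       # step 1: retrieve top N results
    return summarize(query, passages)     # steps 2-3: feed to the model, output a summary

# Toy stand-ins so the sketch runs without any external service.
PAGES = [
    "AI-driven analytics uses machine learning to surface insights.",
    "SaaS pricing models vary by seat count.",
    "AI-driven analytics helps SaaS teams forecast churn.",
]

def toy_search(query: str, top_n: int) -> list[str]:
    # Rank pages by how many query terms they share (a crude relevance proxy).
    terms = set(query.lower().split())
    ranked = sorted(
        PAGES,
        key=lambda p: -len(terms & set(p.lower().rstrip(".").split())),
    )
    return ranked[:top_n]

def toy_summarize(query: str, passages: list[str]) -> str:
    # A real system would call an LLM here; this just joins the passages.
    return " ".join(passages)

overview = build_overview("ai-driven analytics", toy_search, toy_summarize, top_n=2)
print(overview)
```

In a real pipeline, the search and summarize callables would wrap a search index and an LLM call respectively; passing them in as parameters keeps the two stages separable.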

Because these AI Overviews occupy “position zero,” Sellm highlights that “being cited in an AI Overview delivers tremendous brand exposure,” even if a page ranks outside the top three organically.

Why This Matters: AI Overview SEO vs. Traditional SEO

Sellm emphasizes that “AI Overviews are fundamentally reshaping the SEO landscape, altering how users discover information and how businesses need to approach online visibility.” In other words, the rise of generative summaries has introduced an extra layer on top of classic SEO, which some now call generative engine optimization (GEO):

Impact on Click-Through Rates: While an AI Overview can “increase visibility for cited sources,” it may also cause lower CTR for pages not cited.
Position Zero Visibility: Sellm warns that “even if your page ranks outside the top 3 organically, being featured in an AI Overview delivers tremendous brand exposure.”
Traffic Diversion vs. Brand Recognition: Sellm points out that “some users will read the AI Overview and move on without clicking,” but “being cited in that overview can create subconscious trust. If your brand or domain is mentioned (e.g., ‘according to Forbes’ or ‘in a report by Company X’), that mention can prompt users to remember your name when returning to purchase.”

Importantly, Sellm clarifies that “AI Overview SEO” still relies on all facets of traditional SEO—keyword targeting, backlink profiles, technical health, and E-E-A-T—as these factors “still underpin Google’s AI-powered retrieval stage.” But beyond ranking into the top N results, content must also be formatted so Google’s generative model can easily “consume and summarize” it.

Sellm’s Reverse-Engineering Experiment

Building a Homegrown RAG Model

Sellm’s core insight comes from constructing an in-house RAG (Retrieval-Augmented Generation) pipeline designed to mimic Google’s AI Overview logic. Specifically, they:

Indexed 50 Curated Pages: A mix of long-form research pieces, listicles, how-to guides, and one purpose-built “flagship” article.
Built a Dense Embedding Retriever: To replicate Google’s “Augmentation Stage,” in which the LLM matches query embeddings to content embeddings.
Fine-Tuned an LLM for Summarization: Emulating Google’s “Generation Stage” to produce concise overviews.
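To make the retrieval step concrete, here is a minimal sketch of similarity-based retrieval. It assumes a toy bag-of-words vector in place of a real dense embedding model (Sellm's post does not say which model they used), so `embed`, `cosine`, and `retrieve` are illustrative names only. The ranking logic, cosine similarity between a query vector and each passage vector, is the same idea.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. An assumption standing in for a neural encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k passages most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

passages = [
    "what is ai-driven analytics",
    "pricing tiers for saas tools",
    "ai-driven analytics for saas teams",
]
top = retrieve("ai-driven analytics saas", passages, k=2)
for score, passage in top:
    print(f"{score:.3f}  {passage}")
```

Swapping `embed` for a trained sentence encoder would turn this into dense retrieval proper; the scoring and top-k selection would stay the same.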

Their goal was to see which pages would be most frequently cited in the AI Overview for the keyword “AI-Driven Analytics SaaS,” tracking two key metrics:

Inclusion Score (0–1): “The probability that at least one passage from a page is retrieved and used in the AI Overview’s final summary.”
Citation Count: “The total number of times a page’s content is explicitly quoted or paraphrased in the AI Overview.”
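One plausible way to estimate these two metrics is to run the pipeline several times and count citations per page per run. Sellm's post does not publish its exact formulas, so the sketch below is an assumption: `inclusion_rate` is the share of runs in which a page was cited at least once, `avg_citation_count` is the mean number of citations per run, and the run data is invented for illustration.

```python
def inclusion_rate(citations_per_run: list[int]) -> float:
    """Fraction of runs in which the page was cited at least once."""
    return sum(1 for c in citations_per_run if c > 0) / len(citations_per_run)

def avg_citation_count(citations_per_run: list[int]) -> float:
    """Mean number of citations the page received per run."""
    return sum(citations_per_run) / len(citations_per_run)

# Hypothetical citation counts over five runs (not Sellm's data).
runs = {
    "page_a": [3, 0, 2, 4, 1],
    "page_b": [1, 1, 0, 0, 1],
}

for page, counts in runs.items():
    print(page, inclusion_rate(counts), avg_citation_count(counts))
```

Here `page_a` is included in four of five runs (0.8) with 2.0 citations per run, while `page_b` is included in three of five (0.6) with 0.6 citations per run.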

Key Findings from the Initial Corpus

Sellm reports that their initial test revealed a clear hierarchy:

Title / URL	Max Inclusion Score	Avg. Citation Count
What is AI-Driven Analytics? Pros, Cons and Use Cases	0.643	10.2
AI-Driven Analytics	0.697	1.8
AI-Driven Insights in SaaS Product Management	0.641	1.8
AI in SaaS: How AI is Transforming the Software Industry	0.639	0.8
AI for SaaS analytics	0.702	0.6
How AI is transforming the SaaS landscape	0.642	0.2
Top 10 AI Development Trends Shaping the Future of SaaS	0.623	0.2

Sellm explains that “a score of 0.702 means that at least one passage had a 70.2 percent chance of being used in the summary,” while “an average citation count of 10.2 means the model cited the page about ten times per run.”
Critically, Sellm observes that “Inclusion Score and Citation Count are not strongly correlated,” since a page could be picked once (high inclusion) yet only contribute a short snippet (low citation), or vice versa.
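The weak relationship between the two columns can be checked directly against the table above. The sketch below computes a Pearson correlation coefficient over the seven reported rows; the choice of Pearson is ours, since Sellm does not say how they assessed correlation.

```python
import math

# Values copied from the table above, in row order.
inclusion = [0.643, 0.697, 0.641, 0.639, 0.702, 0.642, 0.623]
citations = [10.2, 1.8, 1.8, 0.8, 0.6, 0.2, 0.2]

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(inclusion, citations)
print(f"Pearson r = {r:.2f}")
```

On these seven rows the coefficient lands close to zero, which is consistent with Sellm's observation, though seven data points are too few to support a firm conclusion.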

This first phase demonstrated that none of the human-written pages fully aligned with the AI Overview “blueprint,” topping out around a 0.70 Inclusion Score and roughly ten citations.

The “Blueprint” for AI Overview Inclusion

Sellm distilled their RAG-model outputs into a reproducible “AI Overview Response Structure.” In their words:

“Think of this as the inherent ‘blueprint’ an AI Overview tends to use. If your content naturally aligns with this format and addresses the underlying question phrasing, its chances of being extracted and summarized increase significantly.”

While the original post doesn’t show the full diagram, Sellm enumerates the critical structural elements:

H2/H3 Headings Mirroring User Questions: The LLM “scans headings for subqueries such as ‘What is AI-Driven Analytics?’”
Concise, Standalone Paragraphs: Under each heading, “2–3 sentence answers make it trivial for the model to extract ‘chunks.’”
Hierarchical Depth for More Citation Opportunities: Nested H2 → H3 → H4 sections “signal distinct answer units,” enabling multiple “micro-citations.”
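To see why this structure helps extraction, consider how a pipeline might split a page into chunks. The sketch below assumes the article is available as Markdown with `##` to `####` headings and treats each heading plus the text beneath it as one candidate answer unit. The function name `chunk_by_heading` and the sample document are hypothetical; Sellm does not describe their chunking code.

```python
import re

# Matches Markdown headings of level 2 through 4 (H2, H3, H4).
HEADING = re.compile(r"^(#{2,4})\s+(.*)$")

def chunk_by_heading(markdown: str) -> list[dict]:
    """Split a document into one chunk per H2/H3/H4 heading."""
    chunks, current = [], None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = {"level": len(match.group(1)),
                       "heading": match.group(2).strip(),
                       "text": ""}
            chunks.append(current)
        elif current is not None and line.strip():
            # Append body text to the most recent heading's chunk.
            current["text"] = (current["text"] + " " + line.strip()).strip()
    return chunks

doc = """## What is AI-Driven Analytics?
AI-driven analytics applies machine learning to business data.
### How does it help SaaS teams?
It flags churn risk early.
#### Which metrics matter?
Retention and expansion revenue.
"""

chunks = chunk_by_heading(doc)
for chunk in chunks:
    print(chunk["level"], chunk["heading"], "->", chunk["text"])
```

Each nested heading yields its own short, self-contained chunk, which is the sense in which deeper hierarchies create more "micro-citation" opportunities.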

Sellm asserts that “by creating an article that meticulously mirrors this structure, we can substantially increase its chances of appearing in the AI Overview.”

Sellm’s AI-Generated, Optimized Article

To validate this, Sellm crafted a new, AI-generated article for “AI-Driven Analytics SaaS” that strictly adhered to the blueprint:

Exact Heading Alignment: Every H2 and H3 corresponded to a likely LLM prompt (e.g., “Define AI-Driven Analytics,” “Key Metrics: Inclusion Score & Citation Count”).
Short, Direct Answers: Each section contained 2–3 sentences that answered one question, often prefaced with “According to Sellm’s 2025 RAG tests…”
Nested Subheadings: H4s under each H3 broke out granular points like “Inclusion Score Definition,” “Citation Count Importance,” and “Structural Best Practices.”

When the new article was run through Sellm’s RAG pipeline alongside the original 50 pages, the results were striking:

“Max Inclusion Score: ≈ 0.754
Avg. Citation Count: ≈ 15.4”

By Sellm’s calculation, this AI-generated text was “approximately 10% more likely to be used in the AI Overview and achieved, on average, around 50% more citations than the top-performing page from our initial corpus.”
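The percentages can be reproduced from the figures reported above. The source does not state which baseline page it used for the inclusion comparison, so the sketch below computes the relative uplift against both candidates: the page with the highest Inclusion Score (0.702) and the most-cited page (0.643, with 10.2 citations). The helper name `relative_uplift` is ours.

```python
def relative_uplift(new: float, baseline: float) -> float:
    """Percentage increase of `new` over `baseline`."""
    return (new - baseline) / baseline * 100

optimized_inclusion = 0.754
optimized_citations = 15.4

best_inclusion = 0.702        # "AI for SaaS analytics"
top_cited_inclusion = 0.643   # "What is AI-Driven Analytics? Pros, Cons and Use Cases"
top_cited_citations = 10.2

vs_best = round(relative_uplift(optimized_inclusion, best_inclusion), 1)
vs_top_cited = round(relative_uplift(optimized_inclusion, top_cited_inclusion), 1)
citation_gain = round(relative_uplift(optimized_citations, top_cited_citations), 1)

print(vs_best)        # 7.4
print(vs_top_cited)   # 17.3
print(citation_gain)  # 51.0
```

The inclusion uplift works out to about 7.4% against the highest-scoring page and about 17.3% against the most-cited page, while the citation uplift is about 51%, in line with the "around 50%" figure.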

Practical Takeaways for Marketers

Sellm’s analysis suggests that the SEO playbook has entered a new era:

Retain Traditional SEO Foundations: As Sellm reiterates, “passing the Retrieval stage still requires traditional tactics”—strong backlinks, keyword relevance, technical health, and E-E-A-T. Without ranking in Google’s top N, structural tricks alone won’t suffice.
Mirror the AI Overview “Blueprint”: Use exact question phrases as H2 and H3 headings. Under each, write “concise, direct answers,” so Google’s LLM can instantly locate and extract chunks.
Nest for Multiple Citations: Create H4 subheads for micro-answers (e.g., “What Is Inclusion Score?”). Sellm found that pages with deeper hierarchical structures “averaged 15 citations—versus 10–12 for shallower structures.”
Use Precise Terminology: Sellm noted that “pages using exact target keywords in headers and opening sentences saw 20–30% more citations.” Terms like “RAG framework” or “AI Overview optimization” should appear verbatim in key headings.
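These takeaways lend themselves to a simple pre-publication check. The sketch below is a hypothetical audit helper, not a tool from Sellm's post: it tests whether a heading is phrased as a question, whether it contains the target keyword verbatim, and whether the answer beneath it stays within three sentences. The sentence splitter is deliberately naive and will miscount text containing abbreviations.

```python
import re

def audit_section(heading: str, answer: str, keyword: str) -> dict:
    """Check one heading/answer pair against the structural takeaways."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    return {
        "is_question": heading.rstrip().endswith("?"),
        "has_keyword": keyword.lower() in heading.lower(),
        "sentence_count": len(sentences),
        "concise": 1 <= len(sentences) <= 3,
    }

report = audit_section(
    heading="What Is AI Overview Optimization?",
    answer=("AI Overview optimization structures content for extraction. "
            "It builds on traditional SEO."),
    keyword="AI Overview optimization",
)
print(report)
```

Running the check over every section of a draft gives a quick list of headings that need rephrasing or answers that need trimming.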

In short, Sellm’s experiment shows that aligning content structure with the LLM’s extraction logic is now as critical as any backlink or technical tweak.

Why This Signals a Paradigm Shift

Sellm warns of three interrelated trends:

Declining Click Volume from LLM Answers: Early on, users would click to verify facts (“hallucination traffic”), but as LLMs become more accurate, “users will get answers directly in the chat interface, further reducing extra reading and click-through.”
AI Overview Citations as the Primary Visibility Metric: Instead of measuring pageviews, “success will be measured by how often AI Overviews pull passages from your site. Each instance counts as a citation.”
Brand Mentions Fuel Authority Even Without Citations: Even if an AI Overview doesn’t quote you verbatim, “an AI Overview may still name your brand, signaling the LLM’s trust in your site.” Sellm emphasizes that both citations and brand mentions “boost awareness and later user engagement.”

Taken together, these points underline Sellm’s thesis: “the AI Overview era is here—time to optimize or be left behind.”

Concluding Thoughts

As a third party summarizing Sellm’s research, we conclude that “AI Overview optimization has already been cracked,” but only for those who understand and apply the precise structural requirements. Sellm’s reverse engineering suggests that optimizing for keywords and backlinks is no longer enough; you must also optimize for an LLM’s appetite: closely aligned headings, concise answer chunks, and nested subheadings.

Key Quotes from Sellm’s Original Post:

“AI Overviews are fundamentally reshaping the SEO landscape, altering how users discover information and how businesses need to approach online visibility.”
“By creating an article that meticulously mirrors this structure, we can substantially increase its chances of appearing in the AI Overview.”
“Our AI-generated text is approximately 10% more likely to be used in the AI Overview and achieved, on average, around 50% more citations than the top-performing page from our initial corpus.”

If your goal is “Position Zero” in Google’s generative summaries, Sellm’s blueprint offers a clear, repeatable roadmap. Rely on traditional SEO to secure a spot among the top N retrieved results, then focus relentlessly on formatting your content so that an LLM can extract and assemble it with minimal effort. In Sellm’s test, the payoff was a higher inclusion score, more citations, and brand exposure that goes beyond simple organic rankings.
