How to Get Your Website Cited by AI - The 6-Step Playbook for Generative Engine Optimization (GEO) - AI Search Visibility Blog | Insights & Data | Otterly.AI
Hi there,
This is Thomas, CEO at OtterlyAI. Over the last few months, I’ve onboarded dozens of marketing teams to OtterlyAI and to AI Search Optimization / Generative Engine Optimization.
In this article, I summarized all the GEO (Generative Engine Optimization) activities that truly matter.
These are proven tactics and efforts I have either tested first-hand, seen implemented by our partners and customers, or at least gathered enough data points on to determine whether they work.
This is not your manual, guide, or ultimate tutorial. These are MY recommendations and tactics that I would suggest to marketing teams at this point.
Generative Engine Optimization – Key Metrics
Traditional metrics like impressions, clicks, and traffic no longer accurately measure AI search success. Consumer behavior and User Interfaces (UI) are different; therefore, we need to define new KPIs to measure success on ChatGPT, Perplexity, and AI search.
These are my GEO KPIs for measuring AI search success:
| KPI | Definition | Our AI Search Goal |
|---|---|---|
| Brand Mentions | Total number of my brand mentions in the selected time period | Get more brand mentions |
| Brand Coverage | Percentage of prompts that mention my brand, out of all prompts | Improve our coverage across all prompts |
| Domain Citations | Total number of domain citations in the selected time period | Get more domain citations |
| Domain Coverage | Percentage of prompts that cite my domain, out of all prompts | Improve our coverage across all prompts |
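As a rough sketch, these four KPIs can be computed from a log of prompt results. The data structure and field names below are my own illustration, not OtterlyAI’s API:

```python
# Sketch: computing the four GEO KPIs from a list of prompt results.
# Each result records how often our brand was mentioned and how often
# our domain was cited in one AI answer. Field names are illustrative.

def geo_kpis(results):
    """results: non-empty list of dicts like
    {"brand_mentions": int, "domain_citations": int}"""
    total_prompts = len(results)
    return {
        # totals across the selected time period
        "brand_mentions": sum(r["brand_mentions"] for r in results),
        "domain_citations": sum(r["domain_citations"] for r in results),
        # coverage: share of prompts where we appear at least once
        "brand_coverage": sum(1 for r in results if r["brand_mentions"] > 0) / total_prompts,
        "domain_coverage": sum(1 for r in results if r["domain_citations"] > 0) / total_prompts,
    }

sample = [
    {"brand_mentions": 2, "domain_citations": 1},
    {"brand_mentions": 0, "domain_citations": 0},
    {"brand_mentions": 1, "domain_citations": 0},
    {"brand_mentions": 0, "domain_citations": 2},
]
print(geo_kpis(sample))
```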
You can check out the full definition of our KPIs here.
Why Do They Matter?
- It’s about how we show up and get recommended as a brand on ChatGPT and AI Search.
- Brand and brand influence define whether we are top of mind among our ideal customers.
- Instead of link tracking, it’s all about citations and whether our content is getting cited on AI Search.
So, now let’s see how we can improve those KPIs for higher AI search visibility.
How to get my website cited on ChatGPT?
These are the key steps to get your website cited on ChatGPT.
| Step | Action | Why It Matters |
|---|---|---|
| 1. Check CDN/Host AI Access | Ensure your Content Delivery Network (CDN) or web host allows AI crawlers (e.g. GPTBot) | Some providers block AI bots by default, which can make your content invisible |
| 2. Configure robots.txt | Explicitly allow AI bots to crawl specific parts of your site | If not properly configured, bots may skip your content, even if it’s valuable. |
| 3. Reduce JavaScript-Only Content | Ensure important content loads as static HTML | Most AI crawlers don’t execute JavaScript, so content that relies on client-side rendering often goes unseen. |
| 4. Optimize Content Structure | Use GEO tools to analyze content clarity, chunking, and tone | Well-structured, plain-language content is easier for AI to retrieve and cite. |
| 5. Run a Relevance Check | Align each section with target prompts using semantic scoring tools | The more semantically relevant your content, the more likely it appears in AI responses. |
| 6. Nail the SEO Basics | Ensure clean HTML, headings, metadata, mobile design, and language tags | These fundamentals help both traditional and AI search engines index your content properly. |
Step 1: Does my CDN/Web Host allow AI crawlers to access my website?
Some Content Delivery Networks (CDNs) and web hosts provide you with settings to control which bots can access your website and which cannot. Here’s a screenshot from Cloudflare—a popular CDN—where you can see the settings options:
- Block AI training bots
- Instruct AI bots via the robots.txt file

We found that many CDNs and web hosts block AI crawlers by default. To be sure, email your CDN provider’s support team and ask for clarity on this. You can have the best content in the world (or in your industry), yet it might still be invisible to AI search engines because your CDN is blocking their crawlers.
You can simulate whether different AI crawlers can access your website with the OtterlyAI Crawler Simulation Tool.
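If you want to spot-check this yourself, a quick (and imperfect) test is to request a page with an AI crawler’s User-Agent and look at the response status. The UA string below follows the format OpenAI publishes for GPTBot, but the version number may change, so treat it as an assumption and check OpenAI’s docs for the current string:

```python
# Sketch: does the CDN serve content to a request that identifies
# itself as GPTBot? A 403 (or a CAPTCHA page) suggests bot blocking.
# The UA string mirrors OpenAI's published format; verify it is current.
import urllib.error
import urllib.request

GPTBOT_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
             "compatible; GPTBot/1.1; +https://openai.com/gptbot")

def build_request(url, user_agent=GPTBOT_UA):
    # Build the request with the crawler's User-Agent header.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def check_access(url):
    try:
        with urllib.request.urlopen(build_request(url), timeout=10) as resp:
            return resp.status   # 200: page served to the "crawler"
    except urllib.error.HTTPError as e:
        return e.code            # 403 often means the bot is blocked

if __name__ == "__main__":
    print(check_access("https://example.com/"))
```

Note this only approximates real crawler access: some CDNs verify bots by IP range, not just User-Agent, so a 200 here does not guarantee the real GPTBot gets through.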

Step 2: Do we block or allow AI crawlers to access our website via the robots.txt file?
Another important housekeeping aspect: Which crawlers are allowed or disallowed via our robots.txt file?
- Allow by default: If a crawler is not mentioned in your robots.txt file, it’s effectively allowed to crawl your site.
- Use the Allow: directive: To explicitly permit certain crawlers or sections of your site for AI training, you can use a syntax such as:
User-agent: GPTBot
Allow: /training-data/
This explicitly allows the crawler to access the specified directory; pair it with Disallow rules if you want to restrict the rest of the site.
Important disclaimer: The instructions in a robots.txt file cannot enforce crawler behavior; it’s up to each crawler to obey them. Most reputable crawlers do, but compliance is voluntary.
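You can sanity-check how your rules will be interpreted with Python’s standard-library robots.txt parser. This sketch parses rules from a string; `RobotFileParser` can also fetch your live file via `set_url()`:

```python
# Sketch: checking what a robots.txt ruleset allows for a given crawler.
import urllib.robotparser

rules = """\
User-agent: GPTBot
Allow: /training-data/
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# GPTBot may fetch the explicitly allowed directory...
print(rp.can_fetch("GPTBot", "/training-data/guide.html"))
# ...but everything else is disallowed by the catch-all rule.
print(rp.can_fetch("GPTBot", "/private/report.html"))
```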
Step 3: How much dynamic content do we have on specific URLs of ours?
It’s important to check the ratio between static and dynamic content on our website, or on any particular URL. Why?
AI crawlers do not execute JavaScript. GPTBot and other OpenAI crawlers see raw HTML; they don’t reliably execute client-side JavaScript. If content is behind JavaScript rendering (client-side), that content may be “invisible.” In other words, if your page is a Single Page Application (SPA) or uses heavy client-side rendering such that the HTML skeleton is mostly empty or consists of placeholders and the “real” content comes later via JS, the AI crawlers may only see those placeholders. (source)
| AI Crawlers | Execute JavaScript? | See JavaScript Content? |
|---|---|---|
| Google (Gemini, Googlebot) | Yes ✅ | Yes ✅ |
| GPTBot, OAI-SearchBot, ChatGPT-User | No ❌ | No ❌ |
| PerplexityBot, ClaudeBot | No ❌ | No ❌ |
When you log into your OtterlyAI account and head over to the GEO audit, you can run any URL audit to check the static vs dynamic content ratio.
As you can see here, a high static-content ratio matters: it ensures our content is accessible and crawlable by the different AI crawlers.
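To get an intuition for what a non-rendering crawler sees, here is a minimal sketch that extracts the readable text from raw HTML, ignoring anything inside script or style tags. It is a rough approximation of a static-content check, not OtterlyAI’s audit logic:

```python
# Sketch of a static-content check: how much readable text exists in
# the raw HTML, before any JavaScript runs? This approximates what a
# non-rendering crawler like GPTBot sees.
from html.parser import HTMLParser

class StaticTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_skip = 0   # nesting depth inside <script>/<style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.in_skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.in_skip:
            self.in_skip -= 1

    def handle_data(self, data):
        if not self.in_skip and data.strip():
            self.chunks.append(data.strip())

def static_text(html):
    p = StaticTextExtractor()
    p.feed(html)
    return " ".join(p.chunks)

# A typical SPA shell: an empty root div plus a script. No visible text.
spa_page = '<html><body><div id="root"></div><script>renderApp()</script></body></html>'
static_page = "<html><body><h1>GEO Guide</h1><p>Step-by-step tactics.</p></body></html>"

print(repr(static_text(spa_page)))    # empty: the "content" only exists after JS runs
print(repr(static_text(static_page))) # the text survives in raw HTML
```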

Bonus: Pros vs. Cons: Static HTML vs Dynamic/Client-Side Rendering (for AI Crawler Visibility)
| Rendering Type | Pros | Cons |
|---|---|---|
| Static HTML | ✅ Fully visible to AI crawlers like GPTBot, ClaudeBot, and PerplexityBot ✅ Faster indexing and citation potential ✅ Easier to test for visibility | ❌ Less flexible for dynamic UIs ❌ Requires build-time or server-side content generation |
| Dynamic / Client-Side Rendering (JS-heavy) | ✅ Flexible for building complex user interfaces ✅ Great for interactivity and SPAs | ❌ Invisible to most AI crawlers (they don’t execute JavaScript) ❌ Content may not be indexed or cited at all ❌ Harder to validate what crawlers can “see” |
Step 4: On Page – Content Chunk Analysis
Now it’s time to analyze and optimize our content for better retrieval. I currently rely heavily on this CustomGPT “GEO Optimizer” created by Andrei Iunisov. This custom GPT analyzes any content or URL based on the following checks:
| GEO Analysis | Description |
|---|---|
| Section Structure | Does each section express a self-contained idea and is it clearly labeled? |
| Plain Language & Direct Questions/Answers | Does the text use plain-language questions and direct answers? |
| Jargon, Metaphors, Clever Intros | Any jargon/metaphors or fancy intros that could be simplified? |
| Unexplained Acronyms | Any acronyms not explained? |
| Modular Content Ideas | What modular additions could enhance the text? |
| Language Clarity and Assertiveness | Are there vague qualifiers or non-assertive phrasing? |
| Semantic Redundancy Suggestions | Could key ideas be rephrased across different parts for better retrieval? |
| Paragraph Idea Unity | Do paragraphs contain multiple ideas that should be split? |
| Clarify Brands, People, Tools, Numbers | Are all entities clearly described? |
| Key Takeaways or Quick Recap | Recommendations on how to recap and summarize content |
| External Linking Proposals | Add useful links (government, research, official spec) |
| Illustration Ideas | Illustrations or graphics that could enhance the blog |
| Tone Adjustments for Professionalism | Any tone adjustment recommendations on Content |
Why is this check useful?
It makes our content better and, therefore, improves the chances of our content being retrieved by ChatGPT.
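A couple of the checks in the table above can be roughly approximated mechanically. The sketch below uses simple regex heuristics for the unexplained-acronym and paragraph-length checks; it is nowhere near as thorough as the GPT-based analysis, and the thresholds are arbitrary:

```python
# Rough heuristic versions of two checks from the GEO analysis table:
# unexplained acronyms and overlong paragraphs.
import re

def unexplained_acronyms(text):
    """All-caps acronyms (2-6 letters) that never appear next to a
    parenthetical, e.g. 'GEO (Generative Engine Optimization)'."""
    acronyms = set(re.findall(r"\b[A-Z]{2,6}\b", text))
    explained = set(re.findall(r"\(([A-Z]{2,6})\)", text))       # "... (GEO)"
    explained |= set(re.findall(r"\b([A-Z]{2,6})\s*\(", text))   # "GEO (...)"
    return sorted(acronyms - explained)

def long_paragraphs(text, max_words=120):
    """Paragraphs (blank-line separated) exceeding the word budget."""
    return [p for p in text.split("\n\n") if len(p.split()) > max_words]

sample = ("GEO (Generative Engine Optimization) is measurable. "
          "Our CDN settings matter too.")
print(unexplained_acronyms(sample))  # ['CDN']
```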
Step 5: On Page Content – Relevance Check
Next, we want to analyze our text for semantic similarity so we can see exactly which paragraph or section aligns best with our target term/target prompt. You can read more about content relevance here.
Personally, I’ve been using Relevance Doctor by iPullRank, a tool for measuring semantic similarity between prompts and content.
The goal is to have text sections that are semantically very similar and closely aligned with our target query. As you can see in the screenshot below, each paragraph is then scored with a similarity score.
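To make the idea of paragraph-level similarity scoring concrete: real tools like Relevance Doctor use dense vector embeddings, but the mechanics can be illustrated with a plain term-frequency cosine similarity that runs on the standard library alone. This is a teaching sketch, not how those tools actually score:

```python
# Illustrative relevance scoring: cosine similarity between a target
# prompt and each content section, using raw term frequencies instead
# of real embeddings.
import math
import re
from collections import Counter

def tf_vector(text):
    # Lowercase word counts as a crude stand-in for an embedding.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    va, vb = tf_vector(a), tf_vector(b)
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

prompt = "how to get my website cited on ChatGPT"
sections = [
    "Allow AI crawlers so your website can get cited on ChatGPT.",
    "Our company picnic was a great success this year.",
]
scores = [cosine(prompt, s) for s in sections]
print(scores)  # the on-topic section scores higher
```

With embeddings, the same loop applies; only `tf_vector` is replaced by a call to an embedding model, and the per-section cosine scores drive which paragraphs to rework.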

Step 6: Content – Are the SEO basics in place?
In addition to the steps outlined above, I always recommend making sure that the following SEO basics are met. Consider them important add-ons to your Generative Engine Optimization strategy.
- A clean HTML structure
- Correct Heading Hierarchy
- Good navigational structure
- Correct metadata
- Mobile-friendly design
- Clear language tags
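Several of these basics can be machine-checked from raw HTML. The sketch below, using only Python’s standard-library parser, flags heading-level jumps (e.g. an h1 followed directly by an h3) and checks for a language tag and a meta description; it is a starting point, not a full SEO audit:

```python
# Sketch: checking heading hierarchy, the lang attribute, and the meta
# description in raw HTML. A heading that skips a level is flagged.
import re
from html.parser import HTMLParser

class SeoBasicsChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.headings = []          # heading levels in document order
        self.has_lang = False
        self.has_description = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if re.fullmatch(r"h[1-6]", tag):
            self.headings.append(int(tag[1]))
        elif tag == "html" and attrs.get("lang"):
            self.has_lang = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.has_description = True

def heading_jumps(levels):
    """(previous, current) pairs where the hierarchy skips a level."""
    return [(a, b) for a, b in zip(levels, levels[1:]) if b - a > 1]

page = ('<html lang="en"><head><meta name="description" content="GEO guide">'
        "</head><body><h1>Guide</h1><h3>Oops</h3></body></html>")
c = SeoBasicsChecker()
c.feed(page)
print(c.has_lang, c.has_description, heading_jumps(c.headings))
```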
Key Takeaways: How to Get Cited in AI Search
Want your content to show up in ChatGPT, Perplexity, Claude, and other AI search engines? These are your must-do moves:
- Check your CDN and hosting setup – Make sure AI crawlers aren’t blocked by default.
- Use robots.txt intentionally – Explicitly allow AI bots like GPTBot access to your training-worthy pages.
- Stick to static HTML for key content – If it’s hidden behind JavaScript, AI crawlers likely won’t see it.
- Structure content into clear, self-contained chunks – Use plain language, direct answers, and avoid clever fluff.
- Ensure high semantic relevance – Use tools to align content closely with the kinds of queries your audience asks.
- Don’t skip SEO basics – Clean code, proper headings, mobile-friendly design, and metadata still matter.
- Get your brand in listicles – These mentions boost both SEO and AI visibility. Link-building service Editorial.Link can help get brands featured.
🎯 Do these six things right, and you’re not just publishing content. You’re making it retrievable, relevant, and ready for the AI era.
GEO Optimization – Frequently Asked Questions (FAQs)
Q: Do all AI crawlers obey the robots.txt file?
A: Most reputable AI crawlers, including GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot, respect the directives in your robots.txt file. However, this is voluntary. There is no enforcement mechanism beyond crawler compliance.
Q2: Will allowing GPTBot cause any privacy or legal issues?
A: Allowing GPTBot to crawl your site means your public content can be used to train AI / LLM models and inform AI responses. If your pages contain sensitive data, personal information, or regulated content (e.g. health, finance), you should consider excluding those paths via robots.txt. Always consult your legal team for jurisdiction-specific compliance.
Q3: How often should I check my content’s semantic relevance?
A: Audit your semantic relevance regularly, ideally monthly or quarterly, and especially after major content updates or shifts in search behavior. Tools like Relevance Doctor or OtterlyAI’s GEO Audit can show how closely your content aligns with current AI search queries.