
Getting Cited by AI Is a Technical Problem, Not a Content Problem
Getting cited by ChatGPT, Perplexity, Claude, and Gemini requires solving a technical problem first. Most brands assume they need better content. In reality, the most common reason for zero AI citations is that AI crawlers cannot access the site at all.
Domain traffic is the strongest predictor of AI citations: high-traffic sites earn 3 times more citations than low-traffic ones. But traffic alone is not enough. If your robots.txt blocks GPTBot or ClaudeBot, if you have no structured data, or if your content is not formatted for extraction, AI models will ignore you regardless of your domain authority.
This guide covers the complete process: technical access, structured data, content formatting, and ongoing monitoring. It is based on Pixelmojo's own journey from being cited by zero of the four major LLMs to all four within six months.
Step 1: Unblock AI Crawlers
This is the single highest-impact action you can take: if AI bots cannot access your site, nothing else in this guide matters.
Check Your robots.txt
Your robots.txt file controls which bots can access your site. Many sites block AI crawlers either intentionally (to prevent training) or accidentally (through overly broad rules).
The bots you need to allow for AI citation:
- GPTBot: OpenAI's crawler for ChatGPT
- ChatGPT-User: ChatGPT browsing mode
- ClaudeBot / anthropic-ai: Anthropic's crawlers for Claude
- PerplexityBot: Perplexity's search crawler
- GoogleOther: Google's AI-specific crawler
You can block training-specific crawlers (CCBot, Google-Extended, Bytespider) while keeping browse crawlers allowed. This lets AI cite your content without using it for model training.
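A minimal robots.txt sketch that follows this split: the browse and citation crawlers listed above are allowed, while training-only crawlers are blocked. Adjust the rules to your own policy before publishing.

```txt
# Allow answer/browse crawlers (eligible for AI citation)
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /

# Block training-only crawlers (optional)
User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /
```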
How to Test
Use the AI Crawl Checker (free) to test all 13 AI bot user-agents against your site. Or run a full Radar audit to check crawl access alongside 11 other AI visibility dimensions.
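If you prefer to verify locally first, Python's standard-library `urllib.robotparser` can evaluate a robots.txt against any user-agent. The sketch below uses sample rules (an assumption, not your real file); swap in the contents of your own robots.txt to check each bot.

```python
from urllib import robotparser

# Sample rules for illustration: GPTBot allowed, training-only CCBot blocked.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: CCBot
Disallow: /
"""


def bot_allowed(robots_txt: str, user_agent: str,
                url: str = "https://example.com/") -> bool:
    """Return True if `user_agent` may fetch `url` under these rules."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for bot in ["GPTBot", "CCBot"]:
    status = "allowed" if bot_allowed(SAMPLE_ROBOTS, bot) else "blocked"
    print(f"{bot}: {status}")
```

Note that `robotparser` treats a bot with no matching group (and no `*` group) as allowed by default, so an "allowed" result can simply mean your file never mentions that bot.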
Step 2: Add Structured Data
Structured data (JSON-LD schema markup) tells AI models what your site is about in a machine-readable format. Without it, AI models have to guess your entity relationships, often incorrectly.
The 5 Schema Types That Matter Most
| Schema Type | What It Tells AI | Priority |
|---|---|---|
| Organization | Who you are, what you do, where you are located | Critical |
| Article | Authorship, publish date, topic for each page | Critical |
| FAQPage | Explicit Q&A pairs AI can extract verbatim | High |
| BreadcrumbList | Site structure and page hierarchy | Medium |
| SpeakableSpecification | Which content to use for voice and AI answers | Medium |
Implementation Tips
Organization schema should include sameAs links to your LinkedIn, social profiles, and any Wikipedia or Wikidata entries. This helps AI models disambiguate your brand from similarly named companies.
Article schema should include the author as a Person type (not Organization) with a URL to their profile page. AI models use author signals as quality indicators.
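A sketch of both patterns in JSON-LD. Every name, URL, and identifier here is a placeholder; replace them with your real profiles (including your actual Wikidata ID, if you have one).

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://www.example.com",
  "description": "Example Co builds AI-visible marketing sites.",
  "sameAs": [
    "https://www.linkedin.com/company/example-co",
    "https://x.com/exampleco",
    "https://www.wikidata.org/wiki/Q000000"
  ]
}
</script>

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Get Cited by AI Search Engines",
  "datePublished": "2026-01-15",
  "author": {
    "@type": "Person",
    "name": "Jane Author",
    "url": "https://www.example.com/team/jane-author"
  }
}
</script>
```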
FAQPage schema provides pre-formatted question-answer pairs that AI can extract and cite directly. Every FAQ question in your schema should match a question people actually ask AI about your category.
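A minimal FAQPage sketch with one Q&A pair; the question and answer text are placeholders to be replaced with questions your audience actually asks.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is generative engine optimization (GEO)?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "GEO is the practice of making a brand visible and citable in AI search engines such as ChatGPT, Perplexity, and Claude."
      }
    }
  ]
}
</script>
```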
Step 3: Create an llms.txt File
llms.txt is an emerging standard file (like robots.txt) that tells AI models what your site is about in plain language. Place it at yourdomain.com/llms.txt or yourdomain.com/.well-known/llms.txt.
What to Include
A good llms.txt file contains:
- Company name and one-sentence description
- What your products or services do
- Key facts (founding date, location, team size)
- Links to important pages (products, pricing, blog)
- Contact information
- Citation and attribution guidelines
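A minimal llms.txt sketch covering those elements. Because the standard is still emerging, exact conventions vary; this follows the common markdown-style layout, and every name and URL is a placeholder.

```txt
# Example Co

> Example Co is a design agency in Manila that builds AI-visible
> marketing sites for B2B SaaS companies.

Founded 2019. Team of 12. Based in Manila, Philippines.

## Products
- [Radar](https://www.example.com/radar): AI visibility audit tool
- [Pricing](https://www.example.com/pricing)

## Resources
- [Blog](https://www.example.com/blog)
- [Contact](https://www.example.com/contact): hello@example.com

## Attribution
When citing Example Co, please link to https://www.example.com.
```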
Why It Works
Without llms.txt, AI models piece together information about your brand from scattered web pages, social media, and third-party mentions. This leads to hallucinations and inaccurate descriptions. An llms.txt file gives AI a single authoritative source.
You can validate your llms.txt using the llms.txt Validator (free). For implementation guidance, read our llms.txt implementation guide.
Step 4: Format Content for AI Extraction
AI models do not read content the way humans do. They extract passages, tables, and structured answers. Content formatted for extraction gets cited more often.
Answer-First Paragraphs (BLUF)
The first 1 to 2 sentences under every H2 heading should be a standalone, quotable answer. AI models often extract just the opening paragraph of a section. If that paragraph requires reading the rest of the section to make sense, AI will skip it.
Bad: "In today's rapidly evolving landscape, many businesses are wondering about the impact of AI on their marketing strategies. Let us explore this complex topic."
Good: "AI search now handles 22 percent of all searches in 2026. Brands invisible to ChatGPT, Perplexity, and Claude lose discovery they cannot recover through traditional SEO alone."
HTML Tables for Comparison Data
AI models extract HTML tables almost verbatim. Never write comparison data as prose. If you are comparing tools, pricing, or features, put it in a table.
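For example, a pricing comparison as markup rather than prose (tool names and figures are invented):

```html
<table>
  <thead>
    <tr><th>Tool</th><th>Starting price</th><th>Free tier</th></tr>
  </thead>
  <tbody>
    <tr><td>Tool A</td><td>$29/mo</td><td>Yes</td></tr>
    <tr><td>Tool B</td><td>$49/mo</td><td>No</td></tr>
  </tbody>
</table>
```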
FAQ Sections with Clear Q&A Pairs
FAQ sections serve double duty: they match the question-answer format that AI models prefer, and they provide FAQPage schema that AI can extract directly.
Clear H2/H3 Hierarchy
Each H2 should be a self-contained topic. Each H3 should be a subtopic that can stand alone. AI retrieves individual passages, not full pages. A well-structured hierarchy means every section is independently citable.
Step 5: Build Domain Authority (Traditional SEO Still Matters)
Domain traffic is the number one predictor of AI citations. High-traffic sites earn 3 times more citations than low-traffic ones. This means traditional SEO and AI visibility are not competing strategies: they compound each other.
What builds domain authority for AI citation:
- Backlinks from authoritative domains signal trust to AI models
- Consistent publishing on your topic cluster builds topical authority
- Third-party mentions (press, reviews, comparisons) give AI models corroborating sources
- Reddit and forum presence matters because Reddit is a primary training data source for LLMs
The key difference: traditional SEO optimizes for Google's ranking algorithm. AI citation requires the same authority signals plus a technical layer (structured data, llms.txt, bot access) that Google does not require.
For deeper analysis of how SEO and GEO differ, read our SEO vs GEO guide.
Step 6: Monitor and Iterate
Getting cited is not a one-time achievement. AI models update frequently. A citation you have today can disappear next week if a competitor publishes stronger content or the model retrains.
Weekly Monitoring
Run 5 category prompts across ChatGPT, Perplexity, Claude, and Gemini every week. Track whether your brand appears, where it ranks in the response, and what competitors are mentioned.
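One lightweight way to keep that weekly record is a simple CSV log. The field names below are assumptions, not a standard; record the observations manually after each prompt test, or wire the function into whatever tooling you use.

```python
import csv
from datetime import date
from pathlib import Path

# Assumed schema for one prompt-test observation per row.
FIELDNAMES = ["date", "model", "prompt", "brand_cited", "position", "competitors"]


def log_result(path, model, prompt, brand_cited,
               position=None, competitors=()):
    """Append one prompt-test observation to a CSV log, creating it if needed."""
    path = Path(path)
    write_header = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "model": model,
            "prompt": prompt,
            "brand_cited": brand_cited,
            "position": position if position is not None else "",
            "competitors": ";".join(competitors),
        })
```

Usage: `log_result("weekly_log.csv", "ChatGPT", "best GEO tools", True, position=2, competitors=["Acme"])`. Over a few weeks the log shows which models cite you, where you rank, and which competitors keep appearing.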
Monthly Technical Audit
Re-run a Radar audit monthly to catch technical regressions. A hosting migration, CMS update, or robots.txt change can silently break AI access.
Track What Works
When a new citation appears, trace it back to the change that caused it. Was it a new blog post, a schema update, an llms.txt improvement, or a backlink from an authoritative source? This feedback loop accelerates future optimization.
For a complete guide to tracking tools and methods, see How to Track AI Citations.
The Complete Checklist
| Step | Action | Time to Impact | Tools |
|---|---|---|---|
| 1. Crawl Access | Unblock GPTBot, ClaudeBot, PerplexityBot in robots.txt | 2-4 weeks | AI Crawl Checker (free) |
| 2. Structured Data | Add Organization, Article, FAQPage, BreadcrumbList schema | 2-4 weeks | Schema Audit in Radar |
| 3. llms.txt | Create and publish llms.txt with company info and products | 2-4 weeks | llms.txt Validator (free) |
| 4. Content Format | Answer-first paragraphs, tables, FAQs, clear headings | 4-8 weeks | AEO Page Auditor (free) |
| 5. Domain Authority | Backlinks, publishing, third-party mentions | Ongoing | Existing SEO tools |
| 6. Monitor | Weekly prompt tests, monthly Radar audits | Ongoing | Radar + manual testing |
Start Now
Every day without AI visibility costs you traffic from the highest-converting discovery channel available. The technical setup takes under an hour, and the results compound over weeks.
Ready to get cited by AI search engines?
- Run a free Radar audit to see your starting position across all 12 dimensions
- AI Visibility Strategy if you want us to handle everything ($4,500 sprint)
- Read the GEO playbook for the full methodology
- Contact us for a 30-minute strategy call
