Glossary
AI visibility terms, plainly explained.
Definitions of every term we use in audit reports, articles, and emails. Updated whenever the AI search world shifts. Linked from every report we ship.
AI visibility
How often AI assistants mention your brand or content when answering relevant questions.
AI visibility is the measurable presence of a website, brand, or business across the answers generated by ChatGPT, Google Gemini, Perplexity, Claude, Microsoft Copilot, and similar AI assistants. It is distinct from search rankings because an AI assistant does not rank a list of links. It synthesizes one answer from a small set of sources. A site can rank in the top three on Google and still be invisible to AI.
Example. Stripe is highly visible to AI assistants because of its rich documentation, frequent technical articles, and consistent presence across Reddit, Hacker News, and Wikipedia.
AI citation
When an AI assistant references your URL or content as a source in its answer.
An AI citation is a moment when a generative AI assistant attributes part of its answer to a specific URL, document, or named source. Citations are the AI-era equivalent of search clicks. Some assistants link the citation visibly (Perplexity, Google AI Overviews, ChatGPT Search); others mention only the brand name without a clickable URL.
Example. A Perplexity answer to "best CRM for small business" might cite three sources: HubSpot.com, a G2 review page, and a Reddit thread. Each one of these is an AI citation.
GEO
Generative Engine Optimization. The practice of optimizing content to be cited by AI assistants.
GEO stands for Generative Engine Optimization. It is the discipline of structuring content, signals, and brand presence so that generative AI engines such as ChatGPT, Gemini, Perplexity, and Claude select your site as a source when synthesizing an answer. GEO overlaps with SEO but extends into AI-specific signals such as llms.txt, structured data, citation-friendly writing patterns, and Wikipedia presence.
Example. A common GEO tactic is the front-loaded answer: putting the direct answer to a question in the first paragraph instead of burying it after a long introduction.
AEO
Answer Engine Optimization. Closely related to GEO. Sometimes used interchangeably.
AEO stands for Answer Engine Optimization. It is older terminology that originally referred to optimizing for featured snippets and Google answer boxes. In 2026 the term has expanded to include optimization for any AI-driven answer surface, including ChatGPT, Gemini, and Perplexity. Some practitioners prefer GEO; others prefer AEO. The work is largely the same.
SEO
Search Engine Optimization. The traditional discipline of ranking on Google and Bing.
SEO is the practice of optimizing a website to rank on Google, Bing, and other classical search engines. SEO covers technical signals (page speed, crawlability, structured data), on-page signals (content quality, keywords, internal links), and off-page signals (backlinks, brand mentions, domain authority). SEO is necessary but no longer sufficient. AI assistants use a different selection process and weight different signals.
llms.txt
A proposed text file at the root of a domain that gives AI assistants a curated catalog of the most important pages.
llms.txt is a markdown-format text file placed at /llms.txt of a website. It provides AI assistants with a curated index of the site's most important pages, written specifically for machine consumption. The format was proposed by Jeremy Howard in 2024. As of 2026, support is uneven: some assistants read it, others ignore it. We treat llms.txt as a low-cost signal worth deploying, not a silver bullet.
Example. See https://aifreeaudit.com/llms.txt for our own implementation, which catalogs our blog posts, audit engine details, pricing, and AI agent instructions.
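As a rough illustration of the format, the sketch below renders a minimal llms.txt body following the layout in the public proposal: a title line, a one-line summary in a blockquote, then sections of annotated links. The site name, URLs, and the helper name render_llms_txt are hypothetical placeholders, not our actual catalog.

```python
# Sketch only: "Example Co", its URLs, and render_llms_txt are
# hypothetical placeholders used to show the llms.txt layout.

def render_llms_txt(title, summary, sections):
    """Build llms.txt text: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    "Example Co",
    "Example Co sells widgets and publishes guides on widget care.",
    {
        "Docs": [
            ("Pricing", "https://example.com/pricing", "Plans and limits"),
            ("Guides", "https://example.com/guides", "How-to articles"),
        ],
    },
)
print(llms_txt)
```

The resulting text would be served at /llms.txt. Which sections to include is a judgment call; the proposal leaves that to the site owner.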
robots.txt
A text file at the root of a domain that tells crawlers and bots which paths they may or may not visit.
robots.txt is a long-standing web standard at /robots.txt that lists user agents and tells each one which URL paths are allowed or disallowed. In 2026 it is critical for AI visibility because it controls whether AI crawlers such as GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Bravebot, and Google-CloudVertexBot may fetch your content. Many sites block these bots by accident, often through a default deny rule inherited from a CDN preset.
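Example. A robots.txt that contains the line "User-agent: GPTBot" followed by the line "Disallow: /" blocks GPTBot from every path on the site, although ChatGPT Search can still surface the site if OAI-SearchBot is allowed. We see this pattern on roughly 12 percent of audited sites.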
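One way to check for an accidental block is to parse robots.txt with Python's standard urllib.robotparser module and ask, bot by bot, whether the home page may be fetched. This is a minimal sketch: the robots.txt content, the example.com URL, and the helper name blocked_bots are assumptions for illustration, and the bot list is a subset of the crawlers named above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks GPTBot but allows everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def blocked_bots(robots_txt, bots, url="https://example.com/"):
    """Return the bots that this robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


print(blocked_bots(ROBOTS_TXT, AI_BOTS))  # ['GPTBot']
```

Real crawlers may interpret edge cases differently from the standard-library parser, so treat the result as a first check rather than a guarantee.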
GPTBot
OpenAI's crawler for collecting content that may be used to train the models behind ChatGPT.
GPTBot is the user agent OpenAI uses to crawl publicly available web pages that may be used to train its generative models. Allowing GPTBot in robots.txt lets your content inform what ChatGPT learns in training. Blocking GPTBot opts the site out of that crawl. Linked citations in ChatGPT Search are governed separately by OAI-SearchBot.
OAI-SearchBot
OpenAI's search crawler, used to surface and link sites in ChatGPT Search.
OAI-SearchBot is the OpenAI user agent used for search, separate from GPTBot. It crawls pages so that ChatGPT Search can surface and link to them when answering a question. Page visits triggered directly by a user's request use a third OpenAI agent, ChatGPT-User. Allowing OAI-SearchBot is essential if you want to be retrievable in ChatGPT Search results.
ClaudeBot
Anthropic's crawler for Claude, the AI assistant made by Anthropic.
ClaudeBot is the primary user agent Anthropic uses to crawl websites for Claude. As of 2026 Anthropic also operates Claude-SearchBot for real-time retrieval. Both bots are referenced in our recommended robots.txt template.
PerplexityBot
Perplexity's crawler for the Perplexity AI answer engine.
PerplexityBot crawls websites to power Perplexity's answer engine. Perplexity emphasizes citations more visibly than other assistants, so getting indexed by PerplexityBot is high leverage for visibility.
Structured data
Machine-readable metadata, typically Schema.org JSON-LD, that describes the meaning of a page.
Structured data is metadata embedded in a web page in a format machines can parse without natural language understanding. The dominant format in 2026 is JSON-LD using the Schema.org vocabulary. Common types include Organization, WebSite, FAQPage, Product, Review, and DefinedTerm. Structured data helps both classical search engines and AI assistants understand what a page is about.
Example. A FAQPage schema with five question-and-answer pairs gives AI assistants a clean structure to draw from when answering related questions.
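A FAQPage object is a list of Question entries, each with an acceptedAnswer. The sketch below builds one from plain question-and-answer pairs. The helper name faq_page_jsonld and the sample pair are illustrative assumptions; the type and property names come from the Schema.org vocabulary.

```python
import json


def faq_page_jsonld(pairs):
    """Build a Schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Sample pair for illustration only.
faq = faq_page_jsonld([
    ("What is GEO?", "Generative Engine Optimization."),
])
print(json.dumps(faq, indent=2))
```

The printed JSON would be placed inside a script tag of type "application/ld+json" on the page that visibly contains the same questions and answers.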
Schema.org
A shared vocabulary for structured data, launched in 2011 by Google, Microsoft, and Yahoo, with Yandex joining later that year.
Schema.org is a collaborative project that defines a vocabulary of types and properties used to describe entities on the web. It is the de facto standard for structured data. AI assistants in 2026 use Schema.org markup as a strong hint about what a page contains.
JSON-LD
JavaScript Object Notation for Linked Data. The recommended format for embedding Schema.org structured data.
JSON-LD is a method of embedding structured data in HTML using a script tag with type "application/ld+json". It is the format Google and AI assistants prefer because it does not pollute the visible content of the page. JSON-LD has replaced microdata and RDFa for most modern websites.
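Because JSON-LD lives in its own script tag, a crawler can read it without touching the visible content. The sketch below shows one way to pull those blocks out of a page using only Python's standard html.parser and json modules. The class name JsonLdExtractor and the sample HTML are hypothetical, and the sketch assumes each block contains valid JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# Hypothetical page used only for illustration.
HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><p>Visible content is untouched.</p></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(HTML)
print(extractor.blocks[0]["@type"])  # Organization
```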
FAQPage schema
A specific Schema.org type that marks up question-and-answer pairs on a page.
FAQPage is a Schema.org type used to mark up frequently-asked-question content. It tells search engines and AI assistants that a page contains structured Q&A pairs. FAQ schema can dramatically improve AI citation rates because it gives the assistant pre-formatted answers it can paraphrase.
DefinedTerm schema
A Schema.org type used to mark up dictionary, glossary, and terminology entries.
DefinedTerm is a Schema.org type for individual glossary entries. Combined with DefinedTermSet, it allows a glossary page to be machine-readable. AI assistants weight DefinedTerm highly when answering "what is X" questions. This page uses DefinedTerm markup for every entry.
Example. You are reading it. View source on this page to see the full DefinedTermSet JSON-LD.
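For readers who want the general shape without viewing source, here is a minimal sketch of a DefinedTermSet that wraps a list of DefinedTerm entries. It is not a copy of this page's markup: the helper name defined_term_set, the example.com URL, and the two sample entries are assumptions, while the property names (hasDefinedTerm, inDefinedTermSet) come from Schema.org.

```python
import json


def defined_term_set(name, url, terms):
    """Build a Schema.org DefinedTermSet from (term, description) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "name": name,
        "url": url,
        "hasDefinedTerm": [
            {
                "@type": "DefinedTerm",
                "name": term,
                "description": description,
                "inDefinedTermSet": url,  # points each term back to its set
            }
            for term, description in terms
        ],
    }


# Hypothetical glossary URL and entries, for illustration only.
glossary = defined_term_set(
    "Glossary",
    "https://example.com/glossary",
    [
        ("GEO", "Generative Engine Optimization."),
        ("AEO", "Answer Engine Optimization."),
    ],
)
print(json.dumps(glossary, indent=2))
```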
sameAs property
A Schema.org property that links a website's Organization or Person to its profiles on other sites such as Wikipedia, LinkedIn, and Crunchbase.
sameAs is a Schema.org property used inside an Organization, Person, or Brand object. It contains an array of URLs that represent the same entity on other authoritative sites. AI assistants use sameAs as a disambiguation hint and as a trust signal. A site whose Organization has sameAs links to Wikipedia, LinkedIn, and Crunchbase is treated as a more established entity.
Example. For AIFreeAudit our sameAs array points to our LinkedIn page, our Crunchbase profile (when active), and the founder's personal site.
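A sameAs array is only useful if every entry is an absolute URL pointing at a distinct profile. The sketch below attaches such an array to an Organization object, rejecting relative paths and dropping duplicates. The helper name with_same_as, the organization, and the profile URLs are hypothetical placeholders, not our real profiles.

```python
from urllib.parse import urlparse


def with_same_as(entity, profile_urls):
    """Return a copy of entity with a deduplicated sameAs array of absolute URLs."""
    same_as = []
    for url in profile_urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"sameAs needs absolute URLs, got: {url!r}")
        if url not in same_as:
            same_as.append(url)
    return {**entity, "sameAs": same_as}


# Placeholder organization and profile URLs, for illustration only.
organization = with_same_as(
    {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": "Example Co",
        "url": "https://example.com",
    },
    [
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
        "https://www.linkedin.com/company/example-co",  # duplicate, dropped
    ],
)
print(organization["sameAs"])
```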
Wikipedia presence
Whether a brand, person, or product has a published Wikipedia article.
Wikipedia presence is the single highest-leverage signal for AI visibility in 2026. The 5W Citation Index research showed Wikipedia accounts for nearly 25 percent of all citations across the major AI assistants. A Wikipedia article elevates a brand from "name on a website" to "recognized entity" in the eyes of AI training data and retrieval pipelines.
Reddit presence
How often a brand or product is discussed on Reddit, especially in topical subreddits.
Reddit presence is the second-largest contributor to AI citations after Wikipedia. AI assistants use Reddit as a source of authentic user opinions and recommendations. Brands that appear in organic Reddit discussions, especially in subreddits like r/smallbusiness, r/marketing, r/webdev, and topical industry subreddits, gain disproportionate AI visibility.
Front-loaded answer
A writing pattern where the direct answer to a question appears in the first paragraph or first sentence.
Front-loaded answer is a citation-friendly writing pattern. Princeton research published in 2024 demonstrated that AI assistants pull 44.2 percent of their citations from the first 30 percent of any source page. Putting the direct answer first measurably increases citation rates compared to long introductions.
Example. This very glossary entry begins with a one-sentence definition before the longer explanation. That is a front-loaded answer.
Citation density
The number of statistics, source links, and named references per thousand words of content.
Citation density measures how many factual claims in a piece of content are backed by an explicit source link or named statistic. AI assistants prefer to cite content that itself cites credible sources. High citation density signals reliability. Conductor 2026 benchmark research showed a 3.2x boost in AI citation rate for content with at least four cited statistics per thousand words.
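The arithmetic is simple: divide the number of cited items by the word count and multiply by one thousand. The sketch below pairs that formula with a deliberately naive counter that treats any http(s) link and any figure followed by "%" or "percent" as a cited item. Both regular expressions and the function names are assumptions for illustration; a real audit would need a stricter definition of what counts as a citation.

```python
import re

# Naive heuristics, assumed for illustration: links and percentage figures.
LINK = re.compile(r"https?://\S+")
STAT = re.compile(r"\b\d+(?:\.\d+)?\s*(?:%|percent\b)")


def per_thousand_words(count, words):
    """Scale a raw count to a rate per thousand words."""
    return 0.0 if words == 0 else count / words * 1000


def citation_density(text):
    """Estimate cited items (links plus percentage figures) per thousand words."""
    words = len(text.split())
    cited = len(LINK.findall(text)) + len(STAT.findall(text))
    return per_thousand_words(cited, words)


# Four cited statistics in a 1,000-word article gives a density of 4.0.
print(per_thousand_words(4, 1000))  # 4.0

sample = "Wikipedia accounts for 25 percent of citations (https://example.com/report)."
print(citation_density(sample))  # 250.0
```

The second result is high only because the sample is eight words long; density is meaningful on full-length content.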
E-E-A-T
Experience, Expertise, Authoritativeness, Trustworthiness. Google's framework for evaluating content quality.
E-E-A-T is a framework Google uses to evaluate content quality, particularly for YMYL (Your Money or Your Life) topics like health, finance, and legal advice. The four pillars are Experience, Expertise, Authoritativeness, and Trustworthiness. AI assistants increasingly evaluate sources through a similar lens. E-E-A-T signals include author bylines with credentials, dateModified timestamps, contact transparency, and outbound links to authoritative sources.
Crawl budget
The number of pages a search engine or AI crawler will fetch from your site within a given time window.
Crawl budget is a finite resource. Each search engine and AI crawler allocates each domain a budget of how many pages it will fetch per day or week. Sites with poor performance, redirect chains, or low-value pages waste crawl budget on URLs that do not contribute to visibility. Optimizing crawl budget means making the highest-value pages easy to discover and fast to fetch.
Sitemap
An XML file at the root of a domain that lists all important URLs and their last-modified dates.
A sitemap is an XML file, typically at /sitemap.xml, that enumerates the URLs on a site that should be crawled. It also includes lastmod timestamps that signal content freshness. AI crawlers use sitemaps to discover new content quickly. A site without a sitemap relies on bots finding pages through internal links, which is slower and less reliable.
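A minimal sitemap needs only a urlset root in the sitemaps.org namespace, with one url element per page holding loc and lastmod. The sketch below generates one with Python's standard xml.etree.ElementTree. The helper name build_sitemap, the example.com URLs, and the dates are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs; lastmod is YYYY-MM-DD."""
    ET.register_namespace("", NS)  # emit NS as the default namespace
    urlset = ET.Element(f"{{{NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{NS}}}url")
        ET.SubElement(url, f"{{{NS}}}loc").text = loc
        ET.SubElement(url, f"{{{NS}}}lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-05-01"),
    ("https://example.com/glossary", "2026-04-12"),
])
print(sitemap_xml)
```

Keeping lastmod accurate matters more than having it present: a timestamp that changes on every build without a content change gives crawlers no usable freshness signal.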
ChatGPT Search
A real-time web search feature inside ChatGPT that returns linked sources alongside the AI answer.
ChatGPT Search is the live retrieval feature in ChatGPT, launched widely in 2024. When a question requires current information, ChatGPT performs a real-time search and includes citation links in the answer. This is distinct from ChatGPT's training data, which is static. Visibility in ChatGPT Search is controlled by allowing OAI-SearchBot in robots.txt and by signals such as freshness and authority.
AI Overviews
Google's AI-generated summary that appears above traditional search results.
AI Overviews is the Google feature that displays a generative AI summary at the top of the search results page. It pulls from multiple sources and links to each one as a citation. AI Overviews coverage has expanded steadily through 2025 and 2026. Visibility in AI Overviews is influenced by traditional ranking signals plus structured data and content patterns.
Perplexity
An AI answer engine that emphasizes visible citations alongside generated answers.
Perplexity is an AI-powered answer engine that places source citations prominently next to its generated answers. Perplexity built its product identity around traceability. Users can click any answer to see which sources contributed. PerplexityBot indexes the web for this purpose.
AIFreeAudit engine version
The current version identifier of our audit engine. Each report carries the engine version that produced it.
The AIFreeAudit engine version is a version string (currently v3.1, May 2026) attached to every audit report we produce. The engine version changes when we add or remove checks, adjust scoring weights, or update the AI bot list. Past versions remain documented in our changelog so reports stay reproducible.
See how AI sees your site
Run a free audit. Get a score across all nine categories in 30 seconds. No signup required.