The website as the primary source, but not the only arbiter
When a brand first encounters the problem of visibility in AI, its instinct is almost always the same: improve the company website. The instinct is sound, but incomplete. The official website does remain the central carrier of primary information about the company: it is where the brand explains who it is, what it does, how the product works, what the prices are, what the constraints are, which usage scenarios it serves, and what evidence supports its competence. But in the answer environment, the website no longer functions as the sole and uncontested source of truth. It is an important case document, but not the only witness. The decision about how exactly to restate the brand for the user is increasingly made by AI on the basis of several source types at once.
This follows from the architecture of modern systems itself. Google Search Central states explicitly that AI Overviews and AI Mode use a fan-out decomposition of the query across subtopics and data sources, then surface a broader and more diverse set of supporting links than classic web search [1]. In its AI Mode help documentation, Google adds that the system breaks the question into subtopics and simultaneously looks for relevant material for each of them [2]. OpenAI describes ChatGPT Search as a mechanism for producing fast, up-to-date answers grounded in web sources and informed by the context of the entire conversation [3][4]. Perplexity expresses the same idea most directly: the system searches the internet in real time, gathers information from trustworthy sources, and condenses it into a short explanation [5]. In the research literature, this family of practices is commonly described as combining the model’s parametric knowledge with external knowledge retrieval at generation time (retrieval-augmented generation) [6][7].
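The fan-out-plus-retrieval pattern can be sketched in a few lines of Python. Everything below is illustrative: the subtopic heuristic, the toy source index, and the brand name are invented stand-ins for web-scale proprietary pipelines, not any platform’s actual implementation.

```python
# Illustrative sketch of a fan-out retrieval pipeline (all names hypothetical).

def fan_out(query):
    """Decompose a broad question into subtopic queries (toy heuristic)."""
    subtopics = ["pricing", "use cases", "reviews"]
    return [f"{query} {topic}" for topic in subtopics]

# A toy "index": in a real system this is web-scale discovery and ranking.
SOURCES = {
    "acme analytics pricing":   ["acme.example/pricing"],
    "acme analytics use cases": ["acme.example/cases", "trade-review.example/acme"],
    "acme analytics reviews":   ["forum.example/thread/42"],
}

def retrieve(sub_query):
    """Look up documents relevant to one subtopic."""
    return SOURCES.get(sub_query, [])

def answer(query):
    """Gather evidence per subtopic, then assemble one grounded answer."""
    evidence = {sq: retrieve(sq) for sq in fan_out(query)}
    cited = sorted({url for urls in evidence.values() for url in urls})
    return {"query": query, "evidence": evidence, "citations": cited}

result = answer("acme analytics")
```

Note what the sketch makes visible: the brand’s own page is only one of several documents the synthesis step draws on, which is exactly why a strong homepage alone does not control the final answer.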
If we translate that technical picture into the language of the brand, the conclusion is simple but important. AI’s opinion about a company is built from at least five layers.
Five layers of the source contour
The first layer is the brand’s owned channels. These include the website, documentation, FAQ sections, product descriptions, pricing pages, case studies, public research, the press center, expert blogs, and, in some cases, video transcripts and technical knowledge bases. This layer defines the base thesaurus: what the brand calls itself, which category it places itself in, and which properties it puts in the foreground. If there is already confusion across the brand’s own channels, no amount of external reputation will save it. The machine needs a starting scaffold.
The second layer is search and link context. Even when the answer shown to the user looks like a conversation, the logic of search infrastructure is often still operating underneath it. Google reminds us that, to participate in AI features, pages must be indexed and broadly suitable for ordinary search [1]. Put simply, the AI intermediary rarely starts from zero: it relies on the preexisting layer of discovery, indexing, and selection of web documents. That is why the site’s technical accessibility, the quality of its text, and basic search discipline still matter. But they no longer guarantee dominance. They merely get the brand into the game.
The third layer is external editorial and industry sources. These include reviews, comparisons, rankings, interviews, analytical materials, trade-media publications, directories, and business profiles. This is usually where the brand gets what it cannot give itself: external validation. If the official website claims that the company is strong in complex enterprise analytics, while independent sources describe it as a niche tool for small business, the answer system has to reconcile those versions. And very often it chooses the version that is better validated and more clearly embedded in the network of links. In answer systems, self-presentation without external verification carries less weight than brands would like.
The fourth layer is the user trace. This includes reviews, forum discussions, questions and answers in communities, mentions on social platforms, opinion catalogs, support pages, and, more broadly, the whole living and not always tidy fabric of the internet in which people explain to one another what a product is and how it works. This layer is noisy and unreliable, but it cannot be ignored. It often shapes the language of actual demand. A company may describe itself as a “modular environment for intelligent data management,” while users discuss it as “a convenient service for complex reporting without a heavy implementation.” For AI, that language matters a great deal, because it is the language in which everyday questions are actually phrased.
The fifth layer is structured knowledge. This includes entity databases, open knowledge graphs, catalogs, business directories, organization profiles, standardized descriptions, and, sometimes, schema markup on the site itself. Survey work on integrating external knowledge into language models shows that linking AI to knowledge bases and graphs improves the factual accuracy, traceability, and explainability of the answer [6][8]. For the brand, this means that the role of “boring” and formal-looking sources increases. They rarely create a vivid reputation, but they often provide stable identification of the entity.
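The schema markup mentioned above usually takes the form of JSON-LD embedded in the page. The sketch below builds a minimal schema.org Organization description; every value is a placeholder, including the Wikidata ID, which is deliberately fictional.

```python
import json

# Minimal schema.org Organization markup (all values are placeholders).
# On a real page this would be embedded as <script type="application/ld+json">.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Analytics",           # canonical brand name
    "url": "https://acme.example",      # canonical website
    "description": "Analytics platform for complex enterprise reporting.",
    "sameAs": [                         # external profiles naming the same entity
        "https://www.wikidata.org/wiki/Q0000000",   # fictional entity ID
        "https://www.linkedin.com/company/acme-example",
    ],
}

markup = json.dumps(org, indent=2)
```

The `sameAs` links are the practical payoff: they tie the website, the knowledge graph entry, and the business profiles to a single entity, which is precisely the stable identification this layer provides.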
Why the website is not the main arbiter
It is precisely the combination of these layers that explains why the website does not become the main character. It may be the primary source, but it is not the arbiter. Answer systems assess not only what the brand claims about itself, but also how that claim is validated, repeated, challenged, or reformulated by other participants in the network. Put more sharply: the website explains what the brand would like to be seen as; the external environment shows what it is actually seen as; and AI tries to assemble a workable compromise between those versions.
Several practically important consequences follow from this.
First, it is impossible to work seriously on visibility in AI if you limit yourself to the homepage. Even a brilliantly written website does not guarantee that the brand will be named in the answer if external sources either fail to validate its key properties or validate them differently.

Second, the official website remains critically important, precisely because it defines the canonical structure of the entity. But its function changes. It must be not only an attractive storefront, but also a reliable point of alignment: a place where AI and humans can see the same name, category, properties, and evidence with equal clarity.

Third, the brand has to manage not only its own text, but also the ecosystem of validation: who writes about it and how, which comparisons it appears in, which catalogs and knowledge bases it is present in, where its methodology is represented, and who can independently validate its role in the market.
From editing the website to managing the entire knowledge contour
What matters especially is that different AI platforms read this environment differently. Google relies on its own search infrastructure and AI modes, where indexability and page eligibility for display matter [1][2]. ChatGPT Search brings in web sources either on request or automatically, while taking the dialogue context into account [3][4]. Perplexity emphasizes almost continuous real-time web retrieval and explicit links [5]. Microsoft Copilot likewise describes its answers as grounded in web search and external links [9][10]. For a brand, this means there is no single “source of truth” from which every machine will read the company in the same way. There is a network of sources that each system assembles according to its own logic.
That is why a mature strategy begins with a more mature question. Not “how should we describe ourselves better on the website?” but “what set of sources forms machine opinion about us — and where in that set are we strong, and where are we being undermined by noise, absence, or someone else’s interpretation?” Only after that question does content work stop being cosmetic and become knowledge management.
This is exactly where the brand’s new role on the internet comes into view. It used to be able to think of the website as the main stage, and everything else as noisy background. Now the picture flips. The site remains the stage, but the performance has long since stopped unfolding only there. The whole internet stages it. And the answer system acts not as a spectator, but as an editor, assembling the final version for the user out of a multitude of voices. In that logic, the winner is not the one that speaks most loudly about itself, but the one whose entity is validated most clearly and consistently across the network.
It is well established that answer systems rely on more than one document and more than one type of signal. For a stable presence, a brand needs a set of aligned sources, not merely a strong homepage.
The exact relative importance of each layer — the website, external media, reviews, catalogs, knowledge graphs — varies from platform to platform and is rarely disclosed in full.
The practical conclusion is straightforward: what must be managed is the entire source contour. An audit of visibility in AI begins with a source map, not with editing a single paragraph on the website.
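A source map of the kind described here can start as a plain inventory of the five layers, each entry flagged by its alignment with the canonical entity. The sketch below is one possible shape for such an audit; the layers come from this article, while the sources and status labels are hypothetical examples.

```python
# Illustrative source map for an AI-visibility audit (all entries hypothetical).
# Each layer from the five-layer contour maps to sources plus a status flag.
SOURCE_MAP = {
    "owned":      [("acme.example/pricing", "aligned"),
                   ("acme.example/docs", "outdated")],
    "search":     [("indexed product pages", "aligned")],
    "editorial":  [("trade-review.example/acme", "conflicting")],
    "user_trace": [("forum.example/thread/42", "noisy")],
    "structured": [("knowledge-graph entity", "missing")],
}

def audit(source_map):
    """List every source that is not aligned with the canonical entity."""
    issues = []
    for layer, sources in source_map.items():
        for name, status in sources:
            if status != "aligned":
                issues.append((layer, name, status))
    return issues

problems = audit(SOURCE_MAP)
```

The point of the exercise is prioritization: an audit that surfaces a missing knowledge-graph entry or a conflicting editorial description identifies higher-leverage work than another round of homepage copyediting.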