What we measure

AI100 measures how naturally a brand appears in neutral AI answers within its category and region. The methodology separates the main score layer (neutral scenarios) from the diagnostic layer (branded queries) and uses a nonlinear 0–100 scale.

Unit of measurement: one model answer to one standardized question scenario.

How a run works

1. Framing the study

First we read the site, infer the category, and clarify which market frame makes sense for comparison. The user selects a Visibility Language: the language in which the models will be queried. This parameter matters because the same brand can face a different competitive landscape depending on the prompt language. A model assembles a separate associative field for each language, and brands that dominate in one language may lose ground to different competitors in another. For international brands we recommend a separate study for each target-market language.

2. Building the question corpus

Then we collect the scenario set: some questions test natural category visibility, while others help explain reputation and answer style.

3. Calculating the core score

The main score uses only neutral scenarios where the brand still has to earn its place through the answer itself. Separately we calculate a diagnostic score (from direct brand mentions), web lift (the gap between memory-only and search-augmented answers), and a confidence interval for the result.

4. Explanation and report

Finally we turn the answer set into a readable report: the score, its stability, the brand's strengths, and the clearest growth zones.

How the score is calculated and read

The jump from weak visibility to a credible middle layer feels dramatic: a brand either barely exists for the model or already appears in a meaningful share of answers. The jump from strong visibility to near-domination is far harder to earn. That asymmetry is why we use a logarithmic transformation.

S = 100 × ln(1 + r / 12) / ln(1 + 100 / 12)

where S is the final score (0–100), r is the raw visibility score (0–100), and 12 is the softener (a calibration parameter).

[Figure: final score S plotted against raw visibility r. The logarithmic curve rises above the linear diagonal; for example, r = 25% maps to roughly 50 points.]
What r means. It is the untransformed visibility signal: how often the brand appears, how high it ranks in answers, and how convincing it looks across the neutral scenario set.
Why we use a logarithm. The logarithm makes the lower and middle parts of the scale more sensitive. A few lucky answers therefore do not turn too quickly into a high final score.
How to read the result. A move from 20 to 40 reflects a real gain in presence. A move from 80 to 90 matters too, but it is much harder to achieve — and that is exactly the effect the nonlinear scale is designed to preserve.
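
To make the transformation concrete, here is a minimal sketch in Python; the function name, the softener default, and the printed checkpoints are illustrative, and only the formula itself comes from the methodology.

```python
import math

def nonlinear_score(r: float, softener: float = 12.0) -> float:
    """Map a raw visibility score r (0-100) onto the nonlinear 0-100 scale:
    S = 100 * ln(1 + r / softener) / ln(1 + 100 / softener)
    """
    return 100.0 * math.log(1.0 + r / softener) / math.log(1.0 + 100.0 / softener)

# The lower half of the scale is the most sensitive:
print(round(nonlinear_score(25), 1))  # ~50.4: a quarter of raw visibility is already half the scale
print(round(nonlinear_score(80), 1))  # ~91.2
print(round(nonlinear_score(90), 1))  # ~95.8: the last raw points buy ever fewer score points
```
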
Confidence interval. Each score comes with a confidence interval — the range within which the score would likely fall if the same corpus of questions were run again. A narrow interval means stable visibility; a wide one means the brand's presence fluctuates across scenarios.
Web lift. The study runs in two modes: model knowledge only and model + web sources. The difference between the two scores is reported as web lift. A positive value means web sources strengthen the brand; a negative value means they weaken it.
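
A minimal sketch of both the confidence interval and web lift follows. The percentile bootstrap over per-scenario raw values is our assumed design; the methodology states only that the interval is bootstrapped (300 iterations per the revision log) and that web lift is the gap between the two run modes.

```python
import math
import random
import statistics

def nonlinear_score(r: float, softener: float = 12.0) -> float:
    # Same transformation as in the sketch above.
    return 100.0 * math.log(1.0 + r / softener) / math.log(1.0 + 100.0 / softener)

def bootstrap_interval(per_scenario_r, n_iter=300, alpha=0.05, seed=0):
    """Percentile bootstrap over per-scenario raw visibility values (assumed design)."""
    rng = random.Random(seed)
    resampled = []
    for _ in range(n_iter):
        sample = [rng.choice(per_scenario_r) for _ in per_scenario_r]
        resampled.append(nonlinear_score(statistics.mean(sample)))
    resampled.sort()
    lo = resampled[int(n_iter * alpha / 2)]            # 2.5th percentile
    hi = resampled[int(n_iter * (1 - alpha / 2)) - 1]  # 97.5th percentile
    return lo, hi

# Web lift: the same corpus runs in two modes and the scores are compared.
score_memory = nonlinear_score(34.0)  # model knowledge only (illustrative r)
score_web = nonlinear_score(41.0)     # model + web sources (illustrative r)
web_lift = score_web - score_memory   # > 0 means web sources strengthen the brand
```
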

Corpus and scoring

Core layer

| Family | What it checks |
| --- | --- |
| Expertise | Does the model see authority signals in the brand's domain |
| Comparison of options | Does the brand hold up in comparative questions without name prompting |
| Customer constraints | Question family inside the core corpus |
| Customer expert | Question family inside the core corpus |
| Customer exploration | Question family inside the core corpus |
| Customer job-to-be-done | Question family inside the core corpus |
| Customer migration | Question family inside the core corpus |
| Customer pain | Question family inside the core corpus |
| Customer trade-offs | Question family inside the core corpus |
| Solution discovery | Does the model name the brand when the user is just starting to search |
| Ranked listings | How high does the model place the brand in an explicit category ranking |
| Shortlist | Does the brand make the shortlist when the user is ready to compare |
| Trust | Does the model associate the brand with reliability and sound choice |

Core score weights

| Metric | What it shows | Weight |
| --- | --- | --- |
| Mention Rate | How often the brand appears in answers | 28.0% |
| Top-3 Rate | How often the brand is in the top part of the answer | 14.0% |
| Top-1 Rate | How often the brand is named first | 10.0% |
| Avg Position | Average brand position across answers | 15.0% |
| Prompt Coverage | In what share of scenarios the brand appears | 18.0% |
| Response Share | How often the brand is mentioned in answer text | 10.0% |
| Text Share | What share of answer text is about the brand | 5.0% |
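
One plausible reading of this table is a weighted sum of normalized metric values; the sketch below assumes each metric is already scaled to 0–100 (with average position inverted so that higher is better), which the methodology does not spell out.

```python
# Weights from the table above; everything else is an illustrative assumption.
CORE_WEIGHTS = {
    "mention_rate": 0.28,
    "top3_rate": 0.14,
    "top1_rate": 0.10,
    "avg_position": 0.15,   # assumed pre-normalized so that higher = better
    "prompt_coverage": 0.18,
    "response_share": 0.10,
    "text_share": 0.05,
}

def raw_score(metrics: dict, weights: dict = CORE_WEIGHTS) -> float:
    """Weighted sum of 0-100 metric values; feeds the nonlinear transformation."""
    return sum(w * metrics[name] for name, w in weights.items())
```
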

Diagnostic layer

This layer does not replace the main score. It explains what happens when the brand is already named, directly compared, or discussed in terms of reputation.

| Family | What it checks |
| --- | --- |
| Alternative choices | Is the brand recalled as an alternative to an already named solution |
| Branded reputation | How the model describes the brand when the name is already given |
| Head-to-head comparison | How the brand fares in a direct comparison with a named competitor |

Diagnostic score weights

| Metric | What it shows | Weight |
| --- | --- | --- |
| Recommendation Rate | Share of answers with an explicit brand recommendation | 30.0% |
| Recommendation Strength | How convincingly the model phrases the recommendation | 25.0% |
| Centrality | Whether the brand is the main topic of the answer | 20.0% |
| Positive Tone | Share of answers with explicitly positive tone | 15.0% |
| Argument Quality | Whether the model supports the recommendation with arguments | 10.0% |
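
The diagnostic score can be read the same way: the weighted-sum sketch from the core layer applies unchanged, only the weight vector differs (again an illustration, not a published formula).

```python
DIAGNOSTIC_WEIGHTS = {
    "recommendation_rate": 0.30,
    "recommendation_strength": 0.25,
    "centrality": 0.20,
    "positive_tone": 0.15,
    "argument_quality": 0.10,
}

# Reuses raw_score() from the core-layer sketch:
# diagnostic_r = raw_score(diagnostic_metrics, DIAGNOSTIC_WEIGHTS)
```
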

Scope and limitations

AI100 runs the same corpus of scenarios through six models from four independent families: GPT-5.3 chat and GPT-5.4 mini (OpenAI), Gemini 2.5 Pro and Gemini 2.5 Flash (Google), Grok 4.1 Fast (xAI), and DeepSeek V3.2. Every model answers in two modes: relying on its internal knowledge only, and with web source augmentation. The final score aggregates answers from all six models — this reduces dependence on any single model's quirks.

These six models cover approximately 93% of free AI assistant users worldwide. The set is fixed and identical for every client: everyone receives the same cross-model measurement, so results across brands can be compared directly. Microsoft Copilot is covered automatically through the OpenAI slots (Copilot uses GPT-5.x in production).
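
To make the run matrix concrete, here is a sketch of the fixed pool and the two modes; the identifier strings and the equal-weight mean are our assumptions, while the pool composition and the modes come from the text above.

```python
# Fixed pool: six models from four independent families (as described above).
MODEL_POOL = {
    "OpenAI": ["GPT-5.3 chat", "GPT-5.4 mini"],
    "Google": ["Gemini 2.5 Pro", "Gemini 2.5 Flash"],
    "xAI": ["Grok 4.1 Fast"],
    "DeepSeek": ["DeepSeek V3.2"],
}
MODES = ("memory_only", "web_augmented")  # every model answers in both

def aggregate(per_model_scores: dict) -> float:
    """Equal-weight mean across all six models (assumed aggregation rule)."""
    models = [m for family in MODEL_POOL.values() for m in family]
    return sum(per_model_scores[m] for m in models) / len(models)
```
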

What AI100 measures

  • How naturally the brand appears in neutral AI answers within its category.
  • How high the brand holds in the answer and whether web sources strengthen it.
  • Which question families make the brand disappear and where it looks stronger than competitors.

What AI100 does not measure

  • Sales, conversion, marketing-team strength, or product quality in themselves.
  • Every language model that exists. AI100 uses a fixed pool of six models covering approximately 93% of free AI assistant users worldwide; that is enough for reliable measurement of mass-market brand visibility, but not for conclusions about specific niche models.
  • An absolute truth about the market. Any measurement depends on the date, the language, the category, and the question corpus.

Methodology history and roadmap

The AI100 methodology evolves in versions. Here is how the formula has changed and what is planned next.

Revision log

| Version | Date | What changed |
| --- | --- | --- |
| v2026.04 | April 2026 | Main formula moved to 7 metrics; opportunity-map quality reserve recalculated. |
| v2026.03 | March 2026 | Diagnostic layer over branded queries introduced as a separate rating. |
| v2026.02 | February 2026 | Switched to a pool of six independent models from different families; cross-model analysis introduced. |
| v2026.01 | January 2026 | Bootstrap iterations for the confidence interval increased from 100 to 300. |

Roadmap

Q2 2026
  • Locking the competitive set between repeat audits of a brand, so share metrics can be compared honestly between runs
Q3 2026
  • Repeat runs to measure within-language and cross-language variance
  • Cross-model analysis extended to additional model families
Later
  • Distribution ecosystems: how models rely on Reddit, YouTube, GitHub, and app stores
  • Longitudinal tracking of a single brand over time

Want to see what it looks like for a real brand?

View sample report