Narrative Contamination & Risk Scoring
Narrative contamination occurs when false or outdated information about your brand spreads from one AI system to others. A single hallucination in ChatGPT can contaminate Claude, Gemini, and other models if that false information enters their training data.
How Contamination Happens
The Contamination Cycle
Week 1: One engine mentions the false claim. ChatGPT: "TruthVouch was founded in 2018" (source: an old blog post in its training data).
Weeks 2-3: Other engines pick it up. Claude crawls a webpage that quoted ChatGPT's response. Now Claude: "Founded in 2018."
Weeks 4-5: It spreads further. Gemini's training data includes a forum post citing ChatGPT. Now Gemini: "Founded in 2018."
Week 6+: Entrenched. Five engines now repeat the false claim, and it is harder to correct because multiple "sources" now exist.
Why It Happens
- Training data includes AI responses
  - LLM training data includes web content
  - Web content increasingly includes AI outputs
  - AI systems cite each other
  - False information spreads through this citation cycle
- Web sources are treated as authoritative
  - If false info appears on 3+ websites, AI systems treat it as fact
  - A single hallucination can create "evidence" by being quoted
- No self-correction
  - AI systems don't fact-check each other
  - There is no mechanism to say "that other AI was wrong"
  - Each system independently learns from its training data
Contamination Risk Scoring
Brand Intelligence scores the contamination risk for each inaccurate claim:
Contamination Risk = (Prevalence × 0.40) + (Growth Rate × 0.35) + (Source Authority × 0.25)
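The weighting can be sketched in code. This is a minimal sketch: the prevalence mapping follows the doc's table (20% per engine, 80%+ at 4 or more), but the numeric values for growth rate and source authority are illustrative assumptions, since those factors are defined only qualitatively.

```python
# Sketch of the contamination risk formula. Prevalence values match
# the doc's table; growth and authority numbers are assumptions.
PREVALENCE = {1: 0.20, 2: 0.40, 3: 0.60}
GROWTH = {"stable": 0.20, "1/week": 0.50, "2+/week": 0.90}      # assumed values
AUTHORITY = {"obscure": 0.20, "major": 0.70, "own_site": 0.90}  # assumed values

def contamination_risk(engines: int, growth: str, authority: str) -> float:
    """Contamination Risk = Prevalence*0.40 + Growth*0.35 + Authority*0.25."""
    prevalence = PREVALENCE.get(engines, 0.80)  # 4+ engines -> 80%+
    return (prevalence * 0.40
            + GROWTH[growth] * 0.35
            + AUTHORITY[authority] * 0.25)

# 2 engines, growing 1 engine/week, obscure source -> roughly 0.39 (medium)
print(contamination_risk(2, "1/week", "obscure"))
```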
Prevalence:
- 1 engine: 20% risk
- 2 engines: 40% risk
- 3 engines: 60% risk
- 4+ engines: 80%+ risk

Growth Rate:
- Stable (not spreading): Low risk
- Growing 1 engine/week: Medium risk
- Growing 2+ engines/week: High risk

Source Authority:
- Obscure web source: Low risk
- Major publication: High risk
- Your website: Very high risk
Risk Levels
LOW RISK (0-30%)
Characteristics:
- Inaccuracy mentioned by only 1 engine
- Not growing (no other engines picking it up)
- Source is obscure
Example:
- ChatGPT says “Founded in 2019” (but you founded in 2020)
- No other engine mentions this
- Source is a single old blog post
Action: Monitor for 2 weeks. If it doesn’t spread, low priority.
MEDIUM RISK (30-60%)
Characteristics:
- Inaccuracy mentioned by 2-3 engines
- Starting to grow
- Or source is semi-authoritative
Example:
- ChatGPT says “500 customers”
- Claude picked it up (now 2/5 engines)
- Growing at 1 engine per week
Action: Prioritize fixing. Counter-messaging needed within 2 weeks.
HIGH RISK (60-80%)
Characteristics:
- Inaccuracy mentioned by 3+ engines
- Growing rapidly (2+ engines per week)
- Multiple sources
Example:
- “Founded in 2018” mentioned by 4 engines
- Gained 2 engines this week
- Now appearing on 5+ websites quoting each other
Action: Urgent. Major counter-messaging and website updates required.
CRITICAL RISK (80%+)
Characteristics:
- Mentioned by 4+ engines
- Rapidly spreading
- High-authority sources (news, Wikipedia, major websites)
Example:
- False claim about your product capabilities
- Mentioned by 5/5 monitored engines
- In major publication or Wikipedia
- Growing daily
Action: Crisis mode. Immediate intervention needed.
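The four bands map directly onto score thresholds; a quick sketch (the function name is illustrative, not part of the product):

```python
def risk_level(risk: float) -> str:
    """Map a contamination risk score in [0, 1] to the four bands above."""
    if risk >= 0.80:
        return "CRITICAL"
    if risk >= 0.60:
        return "HIGH"
    if risk >= 0.30:
        return "MEDIUM"
    return "LOW"

print(risk_level(0.40))  # MEDIUM
print(risk_level(0.85))  # CRITICAL
```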
Detecting Early Contamination
Brand Intelligence alerts you when contamination risk increases:
- Alert Type: “Contamination Risk Rising”
- Trigger: Same inaccuracy spreads to a 2nd engine (30% risk threshold)
- Action: You’re notified immediately
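The trigger rule amounts to a simple check: alert the first time a claim spreads to another engine and its prevalence-driven risk crosses 30%. A hypothetical sketch (names are illustrative, not the product's API):

```python
ALERT_THRESHOLD = 0.30  # a 2nd engine pushes prevalence-driven risk past this

def should_alert(engines_last_week: int, engines_this_week: int) -> bool:
    """Fire a 'Contamination Risk Rising' alert when the same claim
    spreads to another engine and crosses the 30% risk threshold."""
    risk = min(engines_this_week * 0.20, 0.80)  # 20% risk per engine, capped
    spreading = engines_this_week > engines_last_week
    return spreading and risk >= ALERT_THRESHOLD

print(should_alert(1, 1))  # False: 1 engine, 20% risk, no alert
print(should_alert(1, 2))  # True: spread to a 2nd engine, 40% risk
```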
Example alert:
CONTAMINATION ALERT
Claim: "TruthVouch was founded in 2018"

Week 1: ChatGPT (1 engine, 20% risk) → No alert
Week 2: Claude picks it up (2 engines, 40% risk) → ALERT SENT

Risk Level: MEDIUM
Current Spread: 2 engines
Growth Rate: +1 engine per week at current rate

Recommended Action:
1. Create content clarifying the founding year
2. Update relevant website pages
3. Monitor weekly for further spread
Contamination vs. Accuracy
Different but related concepts:
Accuracy:
- Does AI say something true about you?
- Measured per-claim
- Updated weekly as AI systems change
Contamination:
- Is a false claim spreading?
- Measured by count of engines affected
- Growing threat over time
Example:
Claim: "Founded in 2019"
Actual: "Founded in 2020"

Week 1 Accuracy: 80/100 (1 engine wrong, 4 right)
Week 1 Contamination: LOW (only 1 engine affected)

Week 3 Accuracy: 40/100 (3 engines wrong, 2 right)
Week 3 Contamination: MEDIUM (spreading to 3 engines)

Week 6 Accuracy: 0/100 (5 engines wrong)
Week 6 Contamination: CRITICAL (all engines wrong)
Preventing Contamination
Immediate (Stop Spread)
- Publish corrected information on your website
  - Clear, prominent statement of the correct fact
  - Include the date of the correction
  - Example: "TruthVouch was founded in April 2020. [Some sources incorrectly state 2018; this was corrected January 2024.]"
- Create a news/press release
  - Official source documenting the correction
  - Gives AI systems a newer, authoritative source
- Update public profiles
  - LinkedIn company page
  - Wikipedia (if listed)
  - Crunchbase
  - Any directory listing you control
Short-term (Reduce Impact)
- Monitor spread weekly
  - Is the false claim still growing?
  - Check back every Monday
  - Alert if growth continues
- Create counter-content
  - Blog post: "Clarifying Our Founding Story"
  - FAQ: "When was TruthVouch founded?"
  - Makes your correct version more visible in search and AI answers
- Reach out to sources
  - If the false info is on a major website you know about, contact them
  - Request a correction
  - Most will update if you provide evidence
Long-term (Replace False Narrative)
- Wait for model updates
  - AI models retrain periodically
  - ChatGPT: every 3-4 months
  - Claude: every 3-6 months
  - Gemini: every 2-3 months
  - When they retrain on new data (including your corrected website), the false claim should disappear
- Publish more content
  - The more you write about your founding, the better
  - AI systems weight recent, authoritative sources more heavily
  - Your website should dominate results for the topic
- Build backlinks
  - Quality external links to your website increase its authority
  - AI training data gives more weight to well-linked content
Tracking Contamination Over Time
Navigate to Brand Intelligence → Contamination Risk for:
- Contamination Timeline: Tracks each false claim’s spread
- Trend Chart: Shows prevalence over time (growing/stable/declining)
- Per-Engine Breakdown: Which engines have the false claim
- Source Analysis: Where the false claim originated
Example Timeline:
Claim: "AI Visibility Score affects search rankings"
[This is FALSE; GEO measures AI discovery, not SEO]

Timeline:
Jan 15: ChatGPT mentions this (1 engine, 20% risk)
Jan 20: Perplexity mentions it (2 engines, 40% risk)
Jan 25: Claude mentions it (3 engines, 60% risk) ← HIGH RISK ALERT
Jan 30: Gemini mentions it (4 engines, 80% risk) ← CRITICAL ALERT
Feb 5: Still 4 engines (stable, 80% risk) → Plateau

Action Timeline:
Jan 20: You published a blog post clarifying the difference
Jan 25: Contacted sources quoting the false claim
Feb 1: Created FAQ "GEO vs SEO: What's the Difference?"
Feb 5: Noticed growth has stopped
Feb 15: Gemini updated and dropped the false claim (back to 3 engines)
Mar 1: All engines corrected (back to 20% or lower)
Managing Critical Contamination
If a claim reaches CRITICAL risk (80%+):
Immediate Actions (Today)
- Create official correction on your homepage
- Send media inquiry to major sources citing the false claim
- Post on social media with correction
- Notify your team to mention correction in customer conversations
Short-term (This Week)
- Publish detailed blog post explaining correct info
- Update all website pages that might perpetuate the false claim
- Create FAQ addressing the false claim directly
- Monitor daily for further spread
Medium-term (This Month)
- Reach out to Wikipedia (if listed) to correct entry
- Contact major publications that cited false claim
- Build PR campaign around correct narrative
- Implement schema markup on your website to help AI systems understand your facts
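For the schema markup step, one common approach is a schema.org Organization block in JSON-LD carrying the corrected fact. A minimal sketch using the doc's example facts (the URL is a placeholder):

```python
import json

# Organization JSON-LD stating the corrected founding date. Embedded in a
# <script type="application/ld+json"> tag on your site, it gives crawlers
# (and AI training pipelines) a machine-readable statement of the fact.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "TruthVouch",
    "foundingDate": "2020-04",            # corrected: April 2020, not 2018
    "url": "https://www.truthvouch.example",  # placeholder URL
}

print(json.dumps(org, indent=2))
```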
Case Studies
Case 1: Prevented Contamination (Quick Action)
False Claim: "TruthVouch charges per API call"
Actual: TruthVouch charges a flat monthly rate

Week 1: ChatGPT mentions pay-per-call pricing
Action: You immediately publish an "Our Pricing Model" blog post
Week 2: Claude picks it up, but your blog post now appears at the top of search results
Result: Other engines cite your blog instead of ChatGPT
Outcome: Contamination stopped at 2 engines before spreading further
Case 2: Late Intervention (Harder Fix)
False Claim: "TruthVouch is expensive"
Actual: Pricing is competitive with alternatives

Week 1: ChatGPT describes the pricing as high
Week 2: Claude and Gemini pick it up
Week 3: Perplexity mentions it (now 4/5 engines)
Week 4: You create an "ROI Calculator" tool

Result: Your ROI content helps, but the claim is already entrenched
Timeline to full recovery: 3-4 months (waiting for model updates)
Lesson: Act at week 1, not week 4
Tools for Contamination Management
In TruthVouch dashboard:
Contamination Monitoring
- Automated tracking of claim spread
- Alerts when claims reach medium/high risk
- Per-engine tracking
- Source attribution
Recommended Actions
- Auto-generated content suggestions
- FAQ templates addressing false claims
- Blog post outlines
- Counter-messaging frameworks
Risk Forecasting
- Trend projection: “At current rate, will reach 5 engines in X weeks”
- Growth rate monitoring
- Early warnings
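The trend projection can be as simple as linear extrapolation of the weekly engine count; a sketch under that assumption (the product's actual forecasting model isn't documented here):

```python
import math

def weeks_until(target_engines: int, current: int, per_week: float) -> float:
    """Weeks until a claim reaches `target_engines`, assuming the
    current linear growth rate (engines gained per week) holds."""
    if current >= target_engines:
        return 0
    if per_week <= 0:
        return math.inf  # stable or shrinking: never reaches the target
    return math.ceil((target_engines - current) / per_week)

# "At current rate, will reach 5 engines in X weeks"
print(weeks_until(5, 2, 1))  # 3
```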
Next Steps
- Claim Extraction → Understand where claims come from
- Narrative Clustering → How claims form narratives
- Dashboard Alerts → Monitor contamination in real time
- Accuracy Score → Understand overall accuracy impact