Narrative Contamination & Risk Scoring
Narrative contamination occurs when false or outdated information about your brand spreads from one AI system to others. A single hallucination in ChatGPT can contaminate Claude, Gemini, and other models if that false information enters their training data.
How Contamination Happens
The Contamination Cycle
Week 1: One engine mentions the false claim. ChatGPT: "TruthVouch was founded in 2018" (source: an old blog post in its training data).
Weeks 2-3: Other engines pick it up. Claude crawls a webpage that quoted ChatGPT's response. Now Claude: "Founded in 2018."
Weeks 4-5: It spreads further. Gemini's training data includes a forum post citing ChatGPT. Now Gemini: "Founded in 2018."
Week 6+: Entrenched. Five engines now repeat the false claim, and it is harder to correct because multiple "sources" now exist.
Why It Happens
- Training data includes AI responses
  - LLM training data includes web content
  - Web content increasingly includes AI outputs
  - AI systems cite each other
  - False information spreads through this citation cycle
- Web sources are treated as authoritative
  - If false info appears on 3+ websites, AI systems treat it as fact
  - A single hallucination can create "evidence" by being quoted
- No self-correction
  - AI systems don't fact-check each other
  - There is no mechanism to say "that other AI was wrong"
  - Each system independently learns from its training data
Contamination Risk Scoring
Brand Intelligence scores the contamination risk for each inaccurate claim:
Contamination Risk = (Prevalence × 0.40) + (Growth Rate × 0.35) + (Source Authority × 0.25)
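The weighting can be sketched in code. This is a minimal sketch: the prevalence mapping follows the doc's table (20% per engine, 80%+ at 4 or more), but the numeric values for growth rate and source authority are illustrative assumptions, since those factors are defined only qualitatively.

```python
# Sketch of the contamination risk formula. Prevalence values match
# the doc's table; growth and authority numbers are assumptions.
PREVALENCE = {1: 0.20, 2: 0.40, 3: 0.60}
GROWTH = {"stable": 0.20, "1/week": 0.50, "2+/week": 0.90}      # assumed values
AUTHORITY = {"obscure": 0.20, "major": 0.70, "own_site": 0.90}  # assumed values

def contamination_risk(engines: int, growth: str, authority: str) -> float:
    """Contamination Risk = Prevalence*0.40 + Growth*0.35 + Authority*0.25."""
    prevalence = PREVALENCE.get(engines, 0.80)  # 4+ engines -> 80%+
    return (prevalence * 0.40
            + GROWTH[growth] * 0.35
            + AUTHORITY[authority] * 0.25)

# 2 engines, growing 1 engine/week, obscure source -> roughly 0.39 (medium)
print(contamination_risk(2, "1/week", "obscure"))
```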
Prevalence:
- 1 engine: 20% risk
- 2 engines: 40% risk
- 3 engines: 60% risk
- 4+ engines: 80%+ risk

Growth Rate:
- Stable (not spreading): Low risk
- Growing 1 engine/week: Medium risk
- Growing 2+ engines/week: High risk

Source Authority:
- Obscure web source: Low risk
- Major publication: High risk
- Your website: Very high risk
Risk Levels
LOW RISK (0-30%)
Characteristics:
- Inaccuracy mentioned by only 1 engine
- Not growing (no other engines picking it up)
- Source is obscure
Example:
- ChatGPT says “Founded in 2019” (but you founded in 2020)
- No other engine mentions this
- Source is a single old blog post
Action: Monitor for 2 weeks. If it doesn’t spread, low priority.
MEDIUM RISK (30-60%)
Characteristics:
- Inaccuracy mentioned by 2-3 engines
- Starting to grow
- Or source is semi-authoritative
Example:
- ChatGPT says “500 customers”
- Claude picked it up (now 2/5 engines)
- Growing at 1 engine per week
Action: Prioritize fixing. Counter-messaging needed within 2 weeks.
HIGH RISK (60-80%)
Characteristics:
- Inaccuracy mentioned by 3+ engines
- Growing rapidly (2+ engines per week)
- Multiple sources
Example:
- “Founded in 2018” mentioned by 4 engines
- Gained 2 engines this week
- Now appearing on 5+ websites quoting each other
Action: Urgent. Major counter-messaging and website updates required.
CRITICAL RISK (80%+)
Characteristics:
- Mentioned by 4+ engines
- Rapidly spreading
- High-authority sources (news, Wikipedia, major websites)
Example:
- False claim about your product capabilities
- Mentioned by 5/5 monitored engines
- In major publication or Wikipedia
- Growing daily
Action: Crisis mode. Immediate intervention needed.
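The four bands map directly onto score thresholds; a quick sketch (the function name is illustrative, not part of the product):

```python
def risk_level(risk: float) -> str:
    """Map a contamination risk score in [0, 1] to the four bands above."""
    if risk >= 0.80:
        return "CRITICAL"
    if risk >= 0.60:
        return "HIGH"
    if risk >= 0.30:
        return "MEDIUM"
    return "LOW"

print(risk_level(0.40))  # MEDIUM
print(risk_level(0.85))  # CRITICAL
```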
Detecting Early Contamination
Brand Intelligence alerts you when contamination risk increases:
- Alert Type: “Contamination Risk Rising”
- Trigger: Same inaccuracy spreads to a 2nd engine (30% risk threshold)
- Action: You’re notified immediately
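The trigger rule amounts to a simple check: alert the first time a claim spreads to another engine and its prevalence-driven risk crosses 30%. A hypothetical sketch (names are illustrative, not the product's API):

```python
ALERT_THRESHOLD = 0.30  # a 2nd engine pushes prevalence-driven risk past this

def should_alert(engines_last_week: int, engines_this_week: int) -> bool:
    """Fire a 'Contamination Risk Rising' alert when the same claim
    spreads to another engine and crosses the 30% risk threshold."""
    risk = min(engines_this_week * 0.20, 0.80)  # 20% risk per engine, capped
    spreading = engines_this_week > engines_last_week
    return spreading and risk >= ALERT_THRESHOLD

print(should_alert(1, 1))  # False: 1 engine, 20% risk, no alert
print(should_alert(1, 2))  # True: spread to a 2nd engine, 40% risk
```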
Example alert:
CONTAMINATION ALERT
Claim: "TruthVouch was founded in 2018"

Week 1: ChatGPT (1 engine, 20% risk) → No alert
Week 2: Claude picks it up (2 engines, 40% risk) → ALERT SENT

Risk Level: MEDIUM
Current Spread: 2 engines
Growth Rate: +1 engine per week at current rate

Recommended Action:
1. Create content clarifying the founding year
2. Update relevant website pages
3. Monitor weekly for further spread
Contamination vs. Accuracy
Different but related concepts:
Accuracy:
- Does AI say something true about you?
- Measured per-claim
- Updated weekly as AI systems change
Contamination:
- Is a false claim spreading?
- Measured by count of engines affected
- Growing threat over time
Example:
Claim: "Founded in 2019"
Actual: "Founded in 2020"

Week 1 Accuracy: 80/100 (1 engine wrong, 4 right)
Week 1 Contamination: LOW (only 1 engine affected)

Week 3 Accuracy: 40/100 (3 engines wrong, 2 right)
Week 3 Contamination: MEDIUM (spreading to 3 engines)

Week 6 Accuracy: 0/100 (5 engines wrong)
Week 6 Contamination: CRITICAL (all engines wrong)
Preventing Contamination
Immediate (Stop Spread)
- Publish corrected information on your website
  - Clear, prominent statement of the correct fact
  - Include the date of the correction
  - Example: "TruthVouch was founded in April 2020. [Some sources incorrectly state 2018; this was corrected January 2024.]"
- Create a news/press release
  - Official source documenting the correction
  - Gives AI systems a newer, authoritative source
- Update public profiles
  - LinkedIn company page
  - Wikipedia (if listed)
  - Crunchbase
  - Any directory listing you control
Short-term (Reduce Impact)
- Monitor spread weekly
  - Is the false claim still growing?
  - Check back every Monday
  - Alert if growth continues
- Create counter-content
  - Blog post: "Clarifying Our Founding Story"
  - FAQ: "When was TruthVouch founded?"
  - Makes your correct version more visible in search and AI answers
- Reach out to sources
  - If the false info is on a major website you know about, contact them
  - Request a correction
  - Most will update if you provide evidence
Long-term (Replace False Narrative)
- Wait for model updates
  - AI models retrain periodically
  - ChatGPT: every 3-4 months
  - Claude: every 3-6 months
  - Gemini: every 2-3 months
  - When they retrain on new data (including your corrected website), the false claim should disappear
- Publish more content
  - The more you write about your founding, the better
  - AI systems weight recent, authoritative sources more heavily
  - Your website should dominate results for the topic
- Build backlinks
  - Quality external links to your website increase its authority
  - AI training data gives more weight to well-linked content
Tracking Contamination Over Time
Navigate to Brand Intelligence → Contamination Risk for:
- Contamination Timeline: Tracks each false claim’s spread
- Trend Chart: Shows prevalence over time (growing/stable/declining)
- Per-Engine Breakdown: Which engines have the false claim
- Source Analysis: Where the false claim originated
Example Timeline:
Claim: "AI Visibility Score affects search rankings"
[This is FALSE; GEO measures AI discovery, not SEO]

Timeline:
Jan 15: ChatGPT mentions this (1 engine, 20% risk)
Jan 20: Perplexity mentions it (2 engines, 40% risk)
Jan 25: Claude mentions it (3 engines, 60% risk) ← HIGH RISK ALERT
Jan 30: Gemini mentions it (4 engines, 80% risk) ← CRITICAL ALERT
Feb 5: Still 4 engines (stable, 80% risk) → Plateau

Action Timeline:
Jan 20: You published a blog post clarifying the difference
Jan 25: Contacted sources quoting the false claim
Feb 1: Created FAQ "GEO vs SEO: What's the Difference?"
Feb 5: Noticed growth has stopped
Feb 15: Gemini updated and dropped the false claim (back to 3 engines)
Mar 1: All engines corrected (back to 20% or lower)
Managing Critical Contamination
If a claim reaches CRITICAL risk (80%+):
Immediate Actions (Today)
- Create official correction on your homepage
- Send media inquiry to major sources citing the false claim
- Post on social media with correction
- Notify your team to mention correction in customer conversations
Short-term (This Week)
- Publish detailed blog post explaining correct info
- Update all website pages that might perpetuate the false claim
- Create FAQ addressing the false claim directly
- Monitor daily for further spread
Medium-term (This Month)
- Reach out to Wikipedia (if listed) to correct entry
- Contact major publications that cited false claim
- Build PR campaign around correct narrative
- Implement schema markup on your website to help AI systems understand your facts
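For the schema markup step, one common approach is a schema.org Organization block in JSON-LD carrying the corrected fact. A minimal sketch using the doc's example facts (the URL is a placeholder):

```python
import json

# Organization JSON-LD stating the corrected founding date. Embedded in a
# <script type="application/ld+json"> tag on your site, it gives crawlers
# (and AI training pipelines) a machine-readable statement of the fact.
org = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "TruthVouch",
    "foundingDate": "2020-04",            # corrected: April 2020, not 2018
    "url": "https://www.truthvouch.example",  # placeholder URL
}

print(json.dumps(org, indent=2))
```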
Case Studies
Case 1: Prevented Contamination (Quick Action)
False Claim: "TruthVouch charges per API call"
Actual: TruthVouch charges a flat monthly rate

Week 1: ChatGPT mentions pay-per-call pricing
Action: You immediately publish an "Our Pricing Model" blog post
Week 2: Claude picks it up, but your blog post now appears at the top of search results
Result: Other engines cite your blog instead of ChatGPT
Outcome: Contamination stopped at 2 engines before spreading further
Case 2: Late Intervention (Harder Fix)
False Claim: "TruthVouch is expensive"
Actual: Pricing is competitive with alternatives

Week 1: ChatGPT describes the pricing as high
Week 2: Claude and Gemini pick it up
Week 3: Perplexity mentions it (now 4/5 engines)
Week 4: You create an "ROI Calculator" tool

Result: Your ROI content helps, but the claim is already entrenched
Timeline to full recovery: 3-4 months (waiting for model updates)
Lesson: Act at week 1, not week 4
Tools for Contamination Management
In TruthVouch dashboard:
Contamination Monitoring
- Automated tracking of claim spread
- Alerts when claims reach medium/high risk
- Per-engine tracking
- Source attribution
Recommended Actions
- Auto-generated content suggestions
- FAQ templates addressing false claims
- Blog post outlines
- Counter-messaging frameworks
Risk Forecasting
- Trend projection: “At current rate, will reach 5 engines in X weeks”
- Growth rate monitoring
- Early warnings
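The trend projection can be as simple as linear extrapolation of the weekly engine count; a sketch under that assumption (the product's actual forecasting model isn't documented here):

```python
import math

def weeks_until(target_engines: int, current: int, per_week: float) -> float:
    """Weeks until a claim reaches `target_engines`, assuming the
    current linear growth rate (engines gained per week) holds."""
    if current >= target_engines:
        return 0
    if per_week <= 0:
        return math.inf  # stable or shrinking: never reaches the target
    return math.ceil((target_engines - current) / per_week)

# "At current rate, will reach 5 engines in X weeks"
print(weeks_until(5, 2, 1))  # 3
```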
Next Steps
- Claim Extraction → Understand where claims come from
- Narrative Clustering → How claims form narratives
- Dashboard Alerts → Monitor contamination in real time
- Accuracy Score → Understand overall accuracy impact