
How Corrections Work

Corrections automatically fix hallucinations by deploying accurate information to AI systems. Learn the full pipeline from detection through verification, correction methods, approval workflows, and best practices.

[Screenshot: Correction history showing deployed fixes and verification status]

The Correction Pipeline

Overview

When Shield detects a hallucination, the correction pipeline follows these steps:

DETECT → GENERATE → APPROVE → DEPLOY → VERIFY → RESOLVE

Timeline: Typically 4-24 hours from detection to resolution (faster if auto-approved)
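The stages above can be sketched as a simple state machine. The status names and transitions here are illustrative, not the product's actual API:

```python
from enum import Enum

class CorrectionStatus(Enum):
    """Hypothetical status values mirroring the pipeline stages."""
    DETECTED = "detected"
    GENERATED = "correction_generated"
    PENDING_APPROVAL = "pending_approval"
    DEPLOYED = "deployed"
    VERIFYING = "verifying"
    RESOLVED = "resolved"
    ESCALATED = "escalated"

# Allowed forward transitions. A pending correction may be dismissed
# (straight to resolved), and verification may escalate instead of resolving.
TRANSITIONS = {
    CorrectionStatus.DETECTED: {CorrectionStatus.GENERATED},
    CorrectionStatus.GENERATED: {CorrectionStatus.PENDING_APPROVAL},
    CorrectionStatus.PENDING_APPROVAL: {CorrectionStatus.DEPLOYED,
                                        CorrectionStatus.RESOLVED},
    CorrectionStatus.DEPLOYED: {CorrectionStatus.VERIFYING},
    CorrectionStatus.VERIFYING: {CorrectionStatus.RESOLVED,
                                 CorrectionStatus.ESCALATED},
}

def can_transition(src: CorrectionStatus, dst: CorrectionStatus) -> bool:
    """Check whether a status change is a legal pipeline step."""
    return dst in TRANSITIONS.get(src, set())
```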

Step-by-Step

1. Detect Hallucination

Shield identifies that an AI response contradicts your Truth Nugget:

  • Natural Language Inference (NLI) detects contradiction
  • Confidence threshold ≥60% (moderate to high confidence)
  • Optionally cross-check against multiple AI engines to increase confidence

Status: Detected
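A minimal sketch of how the detection threshold might be applied. The contradiction scores would come from a real NLI model; averaging across engines is an illustrative choice for the cross-check step:

```python
DETECTION_THRESHOLD = 0.60  # ≥60% contradiction confidence triggers an alert

def should_flag(contradiction_scores: list[float]) -> bool:
    """Flag a hallucination when contradiction confidence meets the
    threshold. Cross-checking several engines raises overall confidence;
    here we take the mean across engines (illustrative choice)."""
    if not contradiction_scores:
        return False
    mean = sum(contradiction_scores) / len(contradiction_scores)
    return mean >= DETECTION_THRESHOLD

# Single engine, high contradiction confidence:
assert should_flag([0.82])
# Cross-check across three engines:
assert should_flag([0.55, 0.70, 0.65])
assert not should_flag([0.30])
```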

2. Generate Correction

System generates corrected information using your Truth Nugget:

Method A: Neural Fact Sheet (Recommended)

  • Creates AI-optimized summary of correct fact
  • Includes context, examples, sources
  • Deployed to vector DB for future queries to use as context
  • Teaches AI the right answer

Method B: Direct Response

  • Direct replacement text for the hallucination
  • Some LLM APIs support this; many don’t
  • Used alongside Neural Fact Sheet

Method C: Prompt Engineering

  • Refines system prompt to prevent error
  • Adds guardrails or instructions to prevent hallucination
  • Slower to deploy; affects all future requests

Default: Method A (Neural Fact Sheet)

Status: Correction Generated

3. Approve Correction

Correction awaits approval based on configuration:

Auto-approval (if configured):

  • Low-severity alerts + high confidence → Automatically approved
  • No human review needed (fast)
  • Logged for audit trail

Manual approval (default):

  • Fact owner reviews and approves correction quality
  • Approves, requests changes, or dismisses
  • Response SLA varies by severity (Critical: 15min, High: 1hr, Medium: 4hr, Low: 24hr)

Status: Pending Approval → Approved or Dismissed
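The severity-based review SLAs above can be encoded as a simple lookup (the severity keys are assumed, not a documented config schema):

```python
from datetime import datetime, timedelta

# Review SLAs by severity, as described above (assumed config keys).
APPROVAL_SLA = {
    "critical": timedelta(minutes=15),
    "high": timedelta(hours=1),
    "medium": timedelta(hours=4),
    "low": timedelta(hours=24),
}

def review_deadline(created_at: datetime, severity: str) -> datetime:
    """Return the time by which a fact owner should respond."""
    return created_at + APPROVAL_SLA[severity]

# A critical alert created at noon must be reviewed by 12:15.
deadline = review_deadline(datetime(2024, 1, 1, 12, 0), "critical")
```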

4. Deploy Correction

Approved correction deployed to production:

Deployment target:

  • Neural Fact Sheets → Vector DB / knowledge base
  • Accessible to all monitored AI engines (OpenAI, Anthropic, Google, etc.)
  • Injected into AI’s context on future queries

Deployment time: 15-30 seconds (typically)

Status: Deploying → Deployed

5. Verify Effectiveness

Shield re-checks if correction worked:

Verification runs automatically:

  • 24-72 hours after deployment
  • Same AI engine re-queried with same/similar prompt
  • Response compared to Truth Nugget
  • Confidence measured: is hallucination fixed?

Outcomes:

  • Verified (95% of cases): AI now correct
  • Partially Verified: Correct on some variations; might need refinement
  • Unverified: Still hallucinating; requires investigation

Status: Verifying → Verified or Escalated
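The three verification outcomes can be classified from per-prompt pass/fail results, sketched here under the assumption that the re-query step checks several prompt variations:

```python
def verification_outcome(results: list[bool]) -> str:
    """Classify a verification run from per-prompt results:
    all correct → verified, some correct → partially_verified,
    none correct (or no results) → unverified."""
    if not results:
        return "unverified"
    passed = sum(results)
    if passed == len(results):
        return "verified"
    if passed > 0:
        return "partially_verified"
    return "unverified"

assert verification_outcome([True, True, True]) == "verified"
assert verification_outcome([True, False]) == "partially_verified"
assert verification_outcome([False, False]) == "unverified"
```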

6. Resolve

Alert marked resolved with final status:

Possible final statuses:

  • Resolved - Corrected & Verified — Success
  • Resolved - Dismissed — False positive
  • Resolved - Nugget Updated — Our fact was wrong; we fixed it
  • Escalated — Correction ineffective; manual investigation needed

Status: Resolved or Escalated

Correction Methods

Method A: Neural Fact Sheet

What it is: AI-friendly summary of a fact, embedded in LLM context

How it works:

  1. Fact sheet created from Truth Nugget (statement, context, examples, source)
  2. Converted to vector embedding
  3. Stored in vector database
  4. On future queries, relevant facts retrieved and injected into AI’s prompt
  5. AI uses fact sheet context to generate correct answer
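Steps 2-5 amount to standard vector retrieval followed by context injection. A toy sketch with hand-rolled cosine similarity (a real deployment would use an embedding model and a vector database):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_fact_sheets(query_vec, store, top_k=3):
    """store: list of (fact_sheet_text, embedding) pairs. Return the
    top-k most similar sheets to inject as context."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(user_question: str, fact_sheets: list[str]) -> str:
    """Inject retrieved fact sheets into the AI's prompt (step 4)."""
    context = "\n".join(fact_sheets)
    return (f"Use the following verified facts:\n{context}\n\n"
            f"Question: {user_question}")

store = [("STATEMENT: TruthVouch was founded in 2021", [1.0, 0.0]),
         ("STATEMENT: Pricing starts at the Free tier", [0.0, 1.0])]
sheets = retrieve_fact_sheets([0.9, 0.1], store, top_k=1)
```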

Advantages:

  • Works across all LLM providers
  • Learns from your knowledge (improves over time)
  • Can handle nuance and context
  • Natural for AI to understand

Limitations:

  • Takes 24-72 hours to verify
  • Depends on fact sheet quality (poorly written sheets = worse results)
  • LLM still might ignore facts (unlikely but possible)

Best for: Long-term, sustainable corrections to knowledge

Example:

Your Truth Nugget: "TruthVouch was founded in 2021"
Generated Fact Sheet:
STATEMENT: TruthVouch was founded in 2021
CONTEXT: Founded by Eyal Chen and team at Stanford
EXAMPLES:
- "TruthVouch was founded in 2021"
- "In 2021, TruthVouch was established"
SOURCE: Company website, Inc.com profile
Deployment: Fact sheet added to vector DB
Future query: AI asked "When was TruthVouch founded?"
Result: AI retrieves fact sheet, responds correctly: "TruthVouch was founded in 2021"

Method B: Direct Correction (API-dependent)

What it is: Direct instruction to AI to say something specific

How it works:

  1. Send correction directly to AI API (if supported)
  2. API applies correction to next response
  3. Immediate effect (no waiting for verification)

Advantages:

  • Immediate deployment
  • Works for non-repeating statements
  • No fact sheet quality issues

Limitations:

  • Only works if LLM API supports it
  • Not all providers support this
  • Doesn’t help with similar future queries
  • Requires API integration

Best for: Quick fixes to one-off responses (rarely used)

Note: OpenAI, Anthropic, Google support different correction mechanisms. TruthVouch uses Method A (Neural Fact Sheet) by default as it’s most reliable across providers.

Method C: Prompt Engineering

What it is: Refinement of system prompt or context to prevent error

How it works:

  1. Analyze why hallucination occurred
  2. Add guardrails to system prompt
  3. Redeploy system prompt to monitoring system
  4. Future queries use refined prompt

Advantages:

  • Targets root cause
  • Can prevent entire classes of errors
  • Works across all AI interactions

Limitations:

  • Requires prompt engineering expertise
  • Slower to test and verify
  • May over-constrain AI (reduce creativity/capability)
  • Not suitable for all hallucinations

Best for: Systemic issues (same type of error repeating)

Example:

Hallucination: AI invents product features
Root cause: Prompt doesn't mention "only discuss features from this list"
Fix: Add to system prompt: "Only discuss features listed in [knowledge base].
If a feature isn't listed, say 'I'm not familiar with that feature.'"
Result: AI stops inventing features
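A guardrail like the one in the fix above could be appended to a system prompt programmatically. This helper is purely illustrative; the wording mirrors the example:

```python
def add_guardrail(system_prompt: str, allowed_features: list[str]) -> str:
    """Append a feature-list guardrail to a system prompt
    (illustrative; wording mirrors the example fix above)."""
    guardrail = (
        "Only discuss these features: " + ", ".join(allowed_features) + ". "
        "If a feature isn't listed, say "
        "\"I'm not familiar with that feature.\""
    )
    return system_prompt.rstrip() + "\n\n" + guardrail

prompt = add_guardrail("You are a helpful support assistant.",
                       ["SSO", "Audit Logs"])
```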

Correction Workflows

Standard Workflow

1. Alert Created (hallucination detected)
2. Assign to fact owner
3. Owner reviews (15 min)
4. Owner approves → Neural Fact Sheet deployed (30 sec)
5. System verifies (24-72 hours)
6. Alert resolved (verified successful)

Total time: 24-72 hours

Expedited Workflow (Critical Alerts)

1. Alert Created + immediately escalate
2. CEO/CTO notified via PagerDuty
3. Fact owner reviews (5 min)
4. Approval (2 min)
5. Deployed (15 sec)
6. Verification initiated (24-48 hours)
7. Resolved

Total time: < 10 minutes to deploy; 24-48 hours to verify

Auto-Approval Workflow (Low-Risk Items)

1. Alert Created
2. Confidence ≥80% + Severity Low?
3. Auto-approved (no human review)
4. Deployed (30 sec)
5. Verification (24-72 hours)
6. Resolved

Total time: 24-72 hours (no human delay)
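The gating rule in step 2 reduces to one predicate, sketched here with assumed severity labels:

```python
AUTO_APPROVE_MIN_CONFIDENCE = 0.80

def is_auto_approvable(severity: str, confidence: float) -> bool:
    """Low severity + confidence ≥80% skips human review.
    Everything else falls back to manual approval."""
    return severity == "low" and confidence >= AUTO_APPROVE_MIN_CONFIDENCE

assert is_auto_approvable("low", 0.85)
assert not is_auto_approvable("low", 0.70)    # confidence too low
assert not is_auto_approvable("high", 0.95)   # severity too high
```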

Best Practices

1. Prioritize Corrections

Not all corrections are equal. Prioritize by:

  • Impact: High-impact hallucinations (financial, brand, compliance) first
  • Confidence: High-confidence corrections more likely to succeed
  • Repeatability: Recurring hallucinations (same fact, multiple engines) get more ROI

Example priority:

  1. Critical hallucinations (High impact + High confidence)
  2. High-volume hallucinations (low impact but occurring frequently)
  3. Low-risk corrections (can auto-approve to save time)
  4. Low-priority items (batch and handle weekly)
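One way to turn impact, confidence, and repeatability into a work queue is a weighted score. The weights below are illustrative, chosen so critical alerts outrank high-volume ones, matching the ordering above:

```python
def priority_score(impact: int, confidence: float, occurrences: int) -> float:
    """Illustrative scoring: impact (1-5) weighted by confidence, plus a
    bonus for recurring hallucinations (higher ROI per correction)."""
    return 2.0 * impact * confidence + 0.3 * occurrences

alerts = [
    {"id": "A", "impact": 5, "confidence": 0.9, "occurrences": 1},   # critical
    {"id": "B", "impact": 2, "confidence": 0.8, "occurrences": 10},  # high-volume
    {"id": "C", "impact": 1, "confidence": 0.6, "occurrences": 1},   # low-priority
]
queue = sorted(
    alerts,
    key=lambda a: priority_score(a["impact"], a["confidence"], a["occurrences"]),
    reverse=True,
)
```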

2. Create High-Quality Fact Sheets

Quality matters:

  • Clear statement (unambiguous)
  • Rich context (explains nuance)
  • Good examples (4+ variations)
  • Credible sources (official, with dates)

See Neural Fact Sheets for detailed guidance.

3. Monitor Verification Rates

Track how many corrections actually work:

  • Target: ≥90% verification rate
  • Below target: Investigate why (fact sheet quality? AI limitations?)
  • Improve fact sheets based on verification feedback
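The verification rate is just the fraction of deployed corrections whose re-check came back verified, compared against the 90% target:

```python
TARGET_RATE = 0.90

def verification_rate(outcomes: list[str]) -> float:
    """Fraction of deployed corrections whose re-check was 'verified'."""
    if not outcomes:
        return 0.0
    return outcomes.count("verified") / len(outcomes)

# 9 of 10 corrections verified → exactly at the 90% target.
rate = verification_rate(["verified"] * 9 + ["unverified"])
```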

4. Batch Similar Corrections

When the same fact is hallucinated multiple times:

  • Approve once, deploy to all
  • Saves time and effort
  • Single fact sheet fixes multiple alerts
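Batching amounts to grouping open alerts by the Truth Nugget they contradict, so one approved fact sheet resolves the whole group (field names are assumed):

```python
from collections import defaultdict

def group_alerts_by_fact(alerts):
    """Group open alerts by the Truth Nugget they contradict, so one
    approved fact sheet can resolve every alert in the group."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert["nugget_id"]].append(alert["alert_id"])
    return dict(groups)

alerts = [
    {"alert_id": 1, "nugget_id": "founding-year"},
    {"alert_id": 2, "nugget_id": "founding-year"},
    {"alert_id": 3, "nugget_id": "pricing"},
]
groups = group_alerts_by_fact(alerts)
# "founding-year" alerts 1 and 2 share a single correction.
```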

5. Keep Fact Sheets Current

Outdated fact sheets cause new hallucinations:

  • Update quarterly (at minimum)
  • Update immediately for time-sensitive facts (pricing, team, locations)
  • Archive old versions
  • Version-track all fact sheets

6. Learn from Ineffective Corrections

When verification fails:

  • Review fact sheet: Is wording clear?
  • Check AI engine documentation: Does it support RAG/context?
  • Try different wording: Sometimes LLM is sensitive to phrasing
  • Consider alternative method (prompt engineering vs. fact sheet)

Next Steps

  1. Review recent corrections — What corrections have you deployed?
  2. Check verification rates — Are corrections working? (target: >90%)
  3. Improve fact sheets — For low-performing corrections
  4. Set up auto-approval — For low-risk items
  5. Monitor metrics — Track correction effectiveness monthly