
Auto-Correction Pipeline

The Auto-Correction Pipeline automatically generates, deploys, and verifies corrected responses for hallucinations. The system uses AI-powered methods to create optimal corrections while maintaining a full audit trail. Approvals can be automatic or manual, depending on the correction's confidence score and your risk tolerance.

How Auto-Correction Works

Step 1: Detection

Shield detects a hallucination: an AI response that contradicts one of your Truth Nuggets.

Detection method:

  • Natural Language Inference (NLI) model compares AI response to fact
  • Confidence threshold: ≥60% required to flag as a true positive
  • Cross-check against multiple AI engines to increase confidence
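The detection logic above can be sketched as follows. This is an illustrative sketch, not the product's actual code: it assumes each monitored engine's response gets an NLI contradiction score against the Truth Nugget, flags engines that clear the 60% threshold, and raises overall confidence with cross-engine agreement.

```python
# Hypothetical sketch of the detection step. NLI scoring itself is stubbed:
# contradiction_scores maps engine name -> contradiction probability.

def flag_hallucination(contradiction_scores, threshold=0.60):
    """Return a detection record if any engine clears the threshold, else None."""
    flagged = {e: s for e, s in contradiction_scores.items() if s >= threshold}
    if not flagged:
        return None
    # Cross-checking: confidence is the mean score of flagging engines,
    # weighted by the fraction of engines that agree.
    agreement = len(flagged) / len(contradiction_scores)
    confidence = agreement * sum(flagged.values()) / len(flagged)
    return {"engines": sorted(flagged), "confidence": round(confidence, 2)}

scores = {"openai": 0.91, "anthropic": 0.87, "google": 0.42}
print(flag_hallucination(scores))
# → {'engines': ['anthropic', 'openai'], 'confidence': 0.59}
```

Note how the third engine disagreeing lowers the combined confidence; a single-engine flag would score lower still.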

Step 2: Correction Generation

The system automatically generates a corrected AI response, using your Truth Nugget as the reference.

Three automated methods:

Method A: Neural Fact Sheet (Recommended)

  • AI automatically creates optimized summary of correct fact
  • Embedded in knowledge base / RAG context
  • Future queries to AI engines see the correct fact
  • System learns from your fact nugget and improves AI behavior over time

Method B: Direct Response Correction

  • Direct replacement text (what AI should have said instead)
  • Replaces hallucination with approved text
  • Some LLM APIs support this; not all providers allow it

Method C: Prompt Engineering

  • Refine the prompt/system message to reduce hallucinations
  • Adds context or instructions to prevent the error
  • Slower to deploy; affects all future requests

Default: Method A (Neural Fact Sheet) — Best balance of effectiveness and ease.

Step 3: Approval

Correction awaits approval based on severity and confidence:

Auto-Approve (if configured):

  • Low-severity alerts + high confidence (>85%) → Auto-approved
  • Low-risk updates (product name, founding date) → Auto-approved
  • No human review (fast); system logs auto-approval for audit

Manual Approve (default):

  • High-severity or lower confidence → Requires human review
  • Fact owner reviews correction quality and accuracy
  • Click “Approve” or “Request Changes”
  • If changes requested → Regenerate correction

Approval time: 15 minutes (median)

Step 4: Deployment

Approved correction is automatically deployed to AI monitoring systems.

Automated deployment:

  • Neural Fact Sheets deployed to vector database / knowledge base
  • Embedded in RAG pipeline
  • Accessible to all monitored AI engines (OpenAI, Anthropic, Google, etc.)
  • Takes 15-30 seconds to deploy with zero downtime

Automatic verification:

  • Shield automatically re-queries the AI engine 24-72 hours later
  • Tests if AI now gives correct answer
  • No manual follow-up required

Step 5: Verification

Shield re-checks if hallucination is fixed:

Verification process:

  • Same AI engine queried with same/similar prompt
  • Response compared to Truth Nugget
  • Confidence measured: is the correction working?

Outcomes:

  • Verified (95% of cases): AI now correct; hallucination fixed
  • Partially Verified: Correct on some variations, not all
  • Unverified: Still hallucinating; correction ineffective

Verification time: 24-72 hours (batch-processed nightly)
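The three verification outcomes can be expressed as a small classifier. This is an illustrative sketch (names and shapes are assumptions): each re-query of a prompt variation yields a boolean for whether the response matched the Truth Nugget.

```python
# Hypothetical sketch of the verification outcomes: Verified when every
# tested prompt variation now matches the Truth Nugget, Partially Verified
# when only some do, Unverified when none do.

def classify_verification(results):
    """results: list of booleans, one per prompt variation tested."""
    correct = sum(results)
    if correct == len(results):
        return "Verified"
    if correct > 0:
        return "Partially Verified"
    return "Unverified"

print(classify_verification([True, True, True]))   # → Verified
print(classify_verification([True, False, True]))  # → Partially Verified
print(classify_verification([False, False]))       # → Unverified
```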

Step 6: Resolution

The alert is marked resolved once verification is complete:

Status:

  • Resolved - Corrected & Verified — Success
  • Escalated - Correction Ineffective — Manual investigation needed
  • Resolved - Dismissed — False positive (if dismissed instead of corrected)

Approval Workflows

Automatic Approval

Configuration:

Go to Settings → Correction Approval and define auto-approval rules:

Rule 1: Auto-approve Low-severity hallucinations
IF: Severity = Low
AND Confidence ≥ 80%
AND Fact category NOT "Financial" or "Compliance"
THEN: Auto-approve without review

Rule 2: Auto-approve minor corrections
IF: Confidence ≥ 90%
AND Change is minor (typo, formatting, name variation)
THEN: Auto-approve

Rule 3: Require approval for high-risk items
IF: Severity = Critical
OR Fact category = "Financial" or "Compliance"
OR Confidence < 70%
THEN: Require manual approval
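The three rules above can be sketched as a decision function. This is a minimal sketch, not the product's rule engine; the function signature and field names are assumptions. The high-risk rule is checked first so it always overrides auto-approval.

```python
# Illustrative implementation of the auto-approval rules. Confidence is a
# probability in [0, 1]; severity and category are plain strings.

HIGH_RISK_CATEGORIES = {"Financial", "Compliance"}

def approval_decision(severity, confidence, category, minor_change=False):
    # Rule 3: high-risk items always require manual approval.
    if severity == "Critical" or category in HIGH_RISK_CATEGORIES or confidence < 0.70:
        return "manual"
    # Rule 2: very confident, minor corrections (typo, formatting, name variation).
    if confidence >= 0.90 and minor_change:
        return "auto"
    # Rule 1: low-severity, confident, non-high-risk facts.
    if severity == "Low" and confidence >= 0.80:
        return "auto"
    return "manual"

print(approval_decision("Low", 0.85, "Product"))        # → auto
print(approval_decision("Low", 0.85, "Financial"))      # → manual
print(approval_decision("Critical", 0.95, "Product"))   # → manual
```

Ordering matters: evaluating Rule 3 first guarantees a Critical or Financial/Compliance correction can never slip through on confidence alone.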

Benefits:

  • Faster time-to-correction (no wait for review)
  • Reduces manual workload for low-risk items
  • Maintains audit trail (auto-approvals logged with reason)

Risk:

  • Occasional approval of false positives (rare; <1% with high confidence threshold)
  • Mitigation: System learns from corrections; can roll back if ineffective

Manual Approval

Process:

  1. Alert created; correction suggested
  2. Fact owner notified (email/Slack)
  3. Owner clicks alert in dashboard
  4. Reviews:
    • What AI said (hallucination)
    • Truth Nugget (correct fact)
    • Suggested Neural Fact Sheet
    • Confidence score and evidence
  5. Approves, requests changes, or dismisses
  6. If approved → Correction deploys

Approval options:

  • Approve as-is — Accept suggested correction
  • Request Changes — Suggest different wording; system regenerates
  • Dismiss — False positive; mark alert resolved without correction
  • Escalate — Assign to someone else for decision

Approval time (SLA): Varies by severity

  • Critical: 15 min response SLA
  • High: 1 hour response SLA
  • Medium: 4 hour response SLA
  • Low: 24 hour response SLA
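The SLA table maps directly onto a review-deadline calculation. A minimal sketch, assuming the mapping and field names below (they are not part of the product's documented API):

```python
# Hypothetical SLA lookup: compute the review deadline for an alert from
# its creation time and severity, per the response SLAs above.
from datetime import datetime, timedelta

SLA_MINUTES = {"Critical": 15, "High": 60, "Medium": 240, "Low": 1440}

def review_deadline(created_at, severity):
    return created_at + timedelta(minutes=SLA_MINUTES[severity])

created = datetime(2024, 3, 15, 9, 0)
print(review_deadline(created, "High"))  # → 2024-03-15 10:00:00
```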

Neural Fact Sheets

A Neural Fact Sheet is an AI-friendly summary of your fact, embedded in the AI’s context.

Structure

FACT: Founding Date
STATEMENT: TruthVouch was founded in 2021
CONTEXT: Founded by Eyal Chen and team at Stanford
SOURCE: Company website, Inc.com profile
CONFIDENCE: High (from official sources)
EXAMPLES:
- "In 2021, TruthVouch was founded..."
- "Founded on March 15, 2021..."
- "TruthVouch's founding year is 2021..."

How It Works

  1. Embedding: Neural Fact Sheet converted to vector embedding (via embeddings API)
  2. Storage: Stored in vector database with all other fact sheets
  3. Retrieval: When AI engine is queried, relevant facts retrieved by semantic search
  4. Context: Facts injected into AI’s prompt as context (“Remember that TruthVouch was founded in 2021…”)
  5. Inference: AI uses fact sheet context to generate correct response
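Steps 1-4 can be sketched end to end. This toy example stands in for the real pipeline: production systems use an embeddings API and a vector database, whereas the 3-dimensional vectors and fact strings here are made-up stand-ins for illustration.

```python
# Toy sketch of fact-sheet retrieval: fact sheets stored as vectors,
# ranked by cosine similarity to the query embedding, and the best match
# injected into the AI's prompt as context.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Pretend embeddings; a real system would call an embeddings API.
fact_sheets = {
    "TruthVouch was founded in 2021": [0.9, 0.1, 0.2],
    "TruthVouch's Standard plan costs $500/month": [0.1, 0.9, 0.3],
}

def build_context(query_vec, top_k=1):
    ranked = sorted(fact_sheets, key=lambda f: cosine(fact_sheets[f], query_vec), reverse=True)
    facts = "; ".join(ranked[:top_k])
    return f"Remember that {facts}. Answer using these facts."

# A query embedding close to the founding-date fact retrieves that sheet.
print(build_context([0.88, 0.15, 0.2]))
```

Semantic search means the query never has to repeat the fact's exact wording; any phrasing whose embedding lands near the fact sheet retrieves it.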

Quality

Higher-quality fact sheets = better AI correction:

Effective fact sheet:

  • Clear, concise statement (1-2 sentences)
  • Context explaining WHY (helps AI understand nuance)
  • Examples of how fact might be phrased
  • Sourced (where you learned this fact)
  • Confidence level (High/Medium/Low)

Example:

FACT: Pricing
STATEMENT: TruthVouch's Standard plan costs $500/month
CONTEXT: Price includes up to 5M cross-checks and 3 Truth Nuggets
SOURCE: Pricing page (pricing.truthvouch.com), current as of 2024-Q1
CONFIDENCE: High
EXAMPLES:
- "$500 per month for Standard plan"
- "Standard pricing is $500/month with 5M checks"
- "Enterprise customers pay custom pricing above Standard tier"

Bulk Corrections

Correct multiple hallucinations efficiently:

Scenario: Same Fact Hallucinated 5 Times

Instead of 5 separate corrections, approve once:

  1. Select multiple related alerts (checkbox)
  2. Click Bulk Actions → Apply Correction
  3. Review suggested correction (same for all)
  4. Approve once
  5. Deployed to all monitored AI engines at once

Time saved: Instead of 5 × 15 min = 75 min, now 15 min total
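The bulk flow starts by grouping open alerts by the fact they contradict, so one approval covers all of them. A minimal sketch; the alert shape and `fact_id` field are assumptions for illustration:

```python
# Group open hallucination alerts by the Truth Nugget they contradict,
# so a single approved correction can be applied to the whole group.
from collections import defaultdict

alerts = [
    {"id": 1, "fact_id": "founding-date"},
    {"id": 2, "fact_id": "founding-date"},
    {"id": 3, "fact_id": "pricing"},
    {"id": 4, "fact_id": "founding-date"},
]

def group_by_fact(alerts):
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert["fact_id"]].append(alert["id"])
    return dict(groups)

print(group_by_fact(alerts))  # → {'founding-date': [1, 2, 4], 'pricing': [3]}
```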

Scenario: Fact Update Affects Multiple Hallucinations

When you update a Truth Nugget:

  1. Edit Nugget → New text saved
  2. System finds all recent hallucinations related to this fact
  3. Suggests correction for all (based on new nugget)
  4. Bulk approve option

Example: Update “employee count” from “150” to “200”

  • Find all alerts where AI said incorrect employee count
  • Suggest new fact sheet with updated count
  • One approval fixes all related hallucinations

Rollback Mechanisms

If a correction causes problems (for example, the correction itself is inaccurate):

Automatic Rollback Triggers

System can auto-rollback in edge cases:

IF: Verification shows correction made things worse
AND Accuracy decreased by >5 percentage points
THEN: Roll back correction; revert to previous fact sheet
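The trigger above reduces to one comparison. A sketch, assuming accuracies are measured in [0, 1] by the verification runs before and after deployment:

```python
# Auto-rollback trigger: revert the correction if verification accuracy
# dropped by more than 5 percentage points after deployment.

def should_rollback(accuracy_before, accuracy_after, max_drop_pp=5.0):
    drop_pp = (accuracy_before - accuracy_after) * 100
    return drop_pp > max_drop_pp

print(should_rollback(0.92, 0.84))  # → True (8-point drop)
print(should_rollback(0.92, 0.95))  # → False (accuracy improved)
```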

Manual Rollback

If correction approved and deployed but later found to be wrong:

  1. Go to Corrections History
  2. Find correction
  3. Click Rollback
  4. Previous version restored
  5. New alert created to correct the correction

Audit trail: All rollbacks logged with reason and approver

Monitoring Correction Effectiveness

Dashboard Metrics

Correction Status:

  • Pending Approval: X alerts
  • Deployed: Y alerts
  • Verified: Z alerts
  • Ineffective (requires escalation): W alerts

Verification Rate:

  • % of deployed corrections that verified successfully
  • Target: ≥90% verification rate
  • Below target → investigate why corrections are not working

Time-to-Correction:

  • Average time from detection to verified correction
  • Target: <24 hours for High/Critical

Bulk Action Efficiency:

  • Avg hallucinations fixed per bulk correction
  • Higher = more efficient
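The two headline metrics above fall out of a simple aggregation over correction records. An illustrative sketch; the statuses and field names are assumptions, not the product's schema:

```python
# Compute verification rate and average time-to-verification from a list
# of correction records. "Deployed" here means the correction reached
# verification (verified or ineffective); pending corrections are excluded.

corrections = [
    {"status": "verified", "hours_to_verify": 20},
    {"status": "verified", "hours_to_verify": 30},
    {"status": "ineffective", "hours_to_verify": None},
    {"status": "pending", "hours_to_verify": None},
]

deployed = [c for c in corrections if c["status"] in ("verified", "ineffective")]
verified = [c for c in deployed if c["status"] == "verified"]

verification_rate = len(verified) / len(deployed)  # target: >= 0.90
avg_time = sum(c["hours_to_verify"] for c in verified) / len(verified)

print(f"verification rate: {verification_rate:.0%}, avg time: {avg_time:.0f}h")
```

With the sample data the rate is 67%, which is below the 90% target and would prompt an investigation.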

Alerts When Corrections Fail

Ineffective correction alert:

Correction Ineffective: "TruthVouch founding date"
- Approved & deployed 3/15
- Verification run 3/17 shows AI still says wrong date
- Possible reasons:
  - Fact sheet wording unclear
  - AI engine uses a different knowledge source
  - Prompt injection / adversarial input
- Recommended action: manual review and regeneration

Next Steps

  1. Review auto-approval settings — Which corrections are safe to auto-approve?
  2. Create approval rules — Define thresholds for automatic vs. manual
  3. Train your team — How to review and approve corrections
  4. Monitor effectiveness — Track correction verification rates
  5. Iterate — Improve fact sheets based on verification feedback