
Auto-Correction Pipeline

The Auto-Correction Pipeline automatically generates, deploys, and verifies corrected responses for hallucinations. The system uses AI-powered methods to create optimal corrections while maintaining a full audit trail. Approvals can be automatic or manual, depending on the correction's confidence score and your risk tolerance.

How Auto-Correction Works

Step 1: Detection

Shield detects a hallucination: an AI response that contradicts one of your Truth Nuggets.

Detection method:

  • Natural Language Inference (NLI) model compares AI response to fact
  • Confidence threshold: ≥60% required to flag as a true positive
  • Cross-check against multiple AI engines to increase confidence
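The detection logic above can be sketched as follows. This is an illustrative sketch, not the product's actual code: it assumes each monitored engine's response gets an NLI contradiction score against the Truth Nugget, flags engines that clear the 60% threshold, and raises overall confidence with cross-engine agreement.

```python
# Hypothetical sketch of the detection step. NLI scoring itself is stubbed:
# contradiction_scores maps engine name -> contradiction probability.

def flag_hallucination(contradiction_scores, threshold=0.60):
    """Return a detection record if any engine clears the threshold, else None."""
    flagged = {e: s for e, s in contradiction_scores.items() if s >= threshold}
    if not flagged:
        return None
    # Cross-checking: confidence is the mean score of flagging engines,
    # weighted by the fraction of engines that agree.
    agreement = len(flagged) / len(contradiction_scores)
    confidence = agreement * sum(flagged.values()) / len(flagged)
    return {"engines": sorted(flagged), "confidence": round(confidence, 2)}

scores = {"openai": 0.91, "anthropic": 0.87, "google": 0.42}
print(flag_hallucination(scores))
# → {'engines': ['anthropic', 'openai'], 'confidence': 0.59}
```

Note how the third engine disagreeing lowers the combined confidence; a single-engine flag would score lower still.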

Step 2: Correction Generation

The system automatically generates a corrected AI response, using your Truth Nugget as the reference.

Three automated methods:

Method A: Neural Fact Sheet (Recommended)

  • AI automatically creates optimized summary of correct fact
  • Embedded in knowledge base / RAG context
  • Future queries to AI engines see the correct fact
  • System learns from your fact nugget and improves AI behavior over time

Method B: Direct Response Correction

  • Direct replacement text (what AI should have said instead)
  • Replaces hallucination with approved text
  • Some LLM APIs support this; not all providers allow it

Method C: Prompt Engineering

  • Refine the prompt/system message to reduce hallucinations
  • Adds context or instructions to prevent the error
  • Slower to deploy; affects all future requests

Default: Method A (Neural Fact Sheet) — Best balance of effectiveness and ease.

Step 3: Approval

Correction awaits approval based on severity and confidence:

Auto-Approve (if configured):

  • Low-severity alerts + high confidence (>85%) → Auto-approved
  • Low-risk updates (product name, founding date) → Auto-approved
  • No human review (fast); system logs auto-approval for audit

Manual Approve (default):

  • High-severity or lower confidence → Requires human review
  • Fact owner reviews correction quality and accuracy
  • Click “Approve” or “Request Changes”
  • If changes requested → Regenerate correction

Approval time: 15 minutes (median)

Step 4: Deployment

Approved correction is automatically deployed to AI monitoring systems.

Automated deployment:

  • Neural Fact Sheets deployed to vector database / knowledge base
  • Embedded in RAG pipeline
  • Accessible to all monitored AI engines (OpenAI, Anthropic, Google, etc.)
  • Takes 15-30 seconds to deploy with zero downtime

Automatic verification:

  • Shield automatically re-queries the AI engine 24-72 hours later
  • Tests if AI now gives correct answer
  • No manual follow-up required

Step 5: Verification

Shield re-checks if hallucination is fixed:

Verification process:

  • Same AI engine queried with same/similar prompt
  • Response compared to Truth Nugget
  • Confidence measured: is the correction working?

Outcomes:

  • Verified (95% of cases): AI now correct; hallucination fixed
  • Partially Verified: Correct on some variations, not all
  • Unverified: Still hallucinating; correction ineffective

Verification time: 24-72 hours (batch-processed nightly)
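The three verification outcomes can be expressed as a small classifier. This is an illustrative sketch (names and shapes are assumptions): each re-query of a prompt variation yields a boolean for whether the response matched the Truth Nugget.

```python
# Hypothetical sketch of the verification outcomes: Verified when every
# tested prompt variation now matches the Truth Nugget, Partially Verified
# when only some do, Unverified when none do.

def classify_verification(results):
    """results: list of booleans, one per prompt variation tested."""
    correct = sum(results)
    if correct == len(results):
        return "Verified"
    if correct > 0:
        return "Partially Verified"
    return "Unverified"

print(classify_verification([True, True, True]))   # → Verified
print(classify_verification([True, False, True]))  # → Partially Verified
print(classify_verification([False, False]))       # → Unverified
```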

Step 6: Resolution

The alert is marked resolved once verification is complete:

Status:

  • Resolved - Corrected & Verified — Success
  • Escalated - Correction Ineffective — Manual investigation needed
  • Resolved - Dismissed — False positive (if dismissed instead of corrected)

Approval Workflows

Automatic Approval

Configuration:

Go to Settings → Correction Approval and define auto-approval rules:

Rule 1: Auto-approve Low-severity hallucinations
IF: Severity = Low
AND Confidence ≥ 80%
AND Fact category NOT "Financial" or "Compliance"
THEN: Auto-approve without review

Rule 2: Auto-approve minor corrections
IF: Confidence ≥ 90%
AND Change is minor (typo, formatting, name variation)
THEN: Auto-approve

Rule 3: Require approval for high-risk items
IF: Severity = Critical
OR Fact category = "Financial" or "Compliance"
OR Confidence < 70%
THEN: Require manual approval
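The three rules above can be sketched as a decision function. This is a minimal sketch, not the product's rule engine; the function signature and field names are assumptions. The high-risk rule is checked first so it always overrides auto-approval.

```python
# Illustrative implementation of the auto-approval rules. Confidence is a
# probability in [0, 1]; severity and category are plain strings.

HIGH_RISK_CATEGORIES = {"Financial", "Compliance"}

def approval_decision(severity, confidence, category, minor_change=False):
    # Rule 3: high-risk items always require manual approval.
    if severity == "Critical" or category in HIGH_RISK_CATEGORIES or confidence < 0.70:
        return "manual"
    # Rule 2: very confident, minor corrections (typo, formatting, name variation).
    if confidence >= 0.90 and minor_change:
        return "auto"
    # Rule 1: low-severity, confident, non-high-risk facts.
    if severity == "Low" and confidence >= 0.80:
        return "auto"
    return "manual"

print(approval_decision("Low", 0.85, "Product"))        # → auto
print(approval_decision("Low", 0.85, "Financial"))      # → manual
print(approval_decision("Critical", 0.95, "Product"))   # → manual
```

Ordering matters: evaluating Rule 3 first guarantees a Critical or Financial/Compliance correction can never slip through on confidence alone.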

Benefits:

  • Faster time-to-correction (no wait for review)
  • Reduces manual workload for low-risk items
  • Maintains audit trail (auto-approvals logged with reason)

Risk:

  • Occasional approval of false positives (rare; <1% with high confidence threshold)
  • Mitigation: System learns from corrections; can roll back if ineffective

Manual Approval

Process:

  1. Alert created; correction suggested
  2. Fact owner notified (email/Slack)
  3. Owner clicks alert in dashboard
  4. Reviews:
    • What AI said (hallucination)
    • Truth Nugget (correct fact)
    • Suggested Neural Fact Sheet
    • Confidence score and evidence
  5. Approves, requests changes, or dismisses
  6. If approved → Correction deploys

Approval options:

  • Approve as-is — Accept suggested correction
  • Request Changes — Suggest different wording; system regenerates
  • Dismiss — False positive; mark alert resolved without correction
  • Escalate — Assign to someone else for decision

Approval time (SLA): Varies by severity

  • Critical: 15 min response SLA
  • High: 1 hour response SLA
  • Medium: 4 hour response SLA
  • Low: 24 hour response SLA
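The SLA table maps directly onto a review-deadline calculation. A minimal sketch, assuming the mapping and field names below (they are not part of the product's documented API):

```python
# Hypothetical SLA lookup: compute the review deadline for an alert from
# its creation time and severity, per the response SLAs above.
from datetime import datetime, timedelta

SLA_MINUTES = {"Critical": 15, "High": 60, "Medium": 240, "Low": 1440}

def review_deadline(created_at, severity):
    return created_at + timedelta(minutes=SLA_MINUTES[severity])

created = datetime(2024, 3, 15, 9, 0)
print(review_deadline(created, "High"))  # → 2024-03-15 10:00:00
```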

Neural Fact Sheets

A Neural Fact Sheet is an AI-friendly summary of your fact, embedded in the AI’s context.

Structure

FACT: Founding Date
STATEMENT: TruthVouch was founded in 2021
CONTEXT: Founded by Eyal Chen and team at Stanford
SOURCE: Company website, Inc.com profile
CONFIDENCE: High (from official sources)
EXAMPLES:
- "In 2021, TruthVouch was founded..."
- "Founded on March 15, 2021..."
- "TruthVouch's founding year is 2021..."

How It Works

  1. Embedding: Neural Fact Sheet converted to vector embedding (via embeddings API)
  2. Storage: Stored in vector database with all other fact sheets
  3. Retrieval: When AI engine is queried, relevant facts retrieved by semantic search
  4. Context: Facts injected into AI’s prompt as context (“Remember that TruthVouch was founded in 2021…”)
  5. Inference: AI uses fact sheet context to generate correct response
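Steps 1-4 can be sketched end to end. This toy example stands in for the real pipeline: production systems use an embeddings API and a vector database, whereas the 3-dimensional vectors and fact strings here are made-up stand-ins for illustration.

```python
# Toy sketch of fact-sheet retrieval: fact sheets stored as vectors,
# ranked by cosine similarity to the query embedding, and the best match
# injected into the AI's prompt as context.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Pretend embeddings; a real system would call an embeddings API.
fact_sheets = {
    "TruthVouch was founded in 2021": [0.9, 0.1, 0.2],
    "TruthVouch's Standard plan costs $500/month": [0.1, 0.9, 0.3],
}

def build_context(query_vec, top_k=1):
    ranked = sorted(fact_sheets, key=lambda f: cosine(fact_sheets[f], query_vec), reverse=True)
    facts = "; ".join(ranked[:top_k])
    return f"Remember that {facts}. Answer using these facts."

# A query embedding close to the founding-date fact retrieves that sheet.
print(build_context([0.88, 0.15, 0.2]))
```

Semantic search means the query never has to repeat the fact's exact wording; any phrasing whose embedding lands near the fact sheet retrieves it.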

Quality

Higher-quality fact sheets = better AI correction:

Effective fact sheet:

  • Clear, concise statement (1-2 sentences)
  • Context explaining WHY (helps AI understand nuance)
  • Examples of how fact might be phrased
  • Sourced (where you learned this fact)
  • Confidence level (High/Medium/Low)

Example:

FACT: Pricing
STATEMENT: TruthVouch's Standard plan costs $500/month
CONTEXT: Price includes up to 5M cross-checks and 3 Truth Nuggets
SOURCE: Pricing page (pricing.truthvouch.com), current as of 2024-Q1
CONFIDENCE: High
EXAMPLES:
- "$500 per month for Standard plan"
- "Standard pricing is $500/month with 5M checks"
- "Enterprise customers pay custom pricing above Standard tier"

Bulk Corrections

Correct multiple hallucinations efficiently:

Scenario: Same Fact Hallucinated 5 Times

Instead of 5 separate corrections, approve once:

  1. Select multiple related alerts (checkbox)
  2. Click Bulk Actions → Apply Correction
  3. Review suggested correction (same for all)
  4. Approve once
  5. Deployed to all monitored AI engines at once

Time saved: Instead of 5 × 15 min = 75 min, now 15 min total
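The bulk flow starts by grouping open alerts by the fact they contradict, so one approval covers all of them. A minimal sketch; the alert shape and `fact_id` field are assumptions for illustration:

```python
# Group open hallucination alerts by the Truth Nugget they contradict,
# so a single approved correction can be applied to the whole group.
from collections import defaultdict

alerts = [
    {"id": 1, "fact_id": "founding-date"},
    {"id": 2, "fact_id": "founding-date"},
    {"id": 3, "fact_id": "pricing"},
    {"id": 4, "fact_id": "founding-date"},
]

def group_by_fact(alerts):
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert["fact_id"]].append(alert["id"])
    return dict(groups)

print(group_by_fact(alerts))  # → {'founding-date': [1, 2, 4], 'pricing': [3]}
```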

Scenario: Fact Update Affects Multiple Hallucinations

When you update a Truth Nugget:

  1. Edit Nugget → New text saved
  2. System finds all recent hallucinations related to this fact
  3. Suggests correction for all (based on new nugget)
  4. Bulk approve option

Example: Update “employee count” from “150” to “200”

  • Find all alerts where AI said incorrect employee count
  • Suggest new fact sheet with updated count
  • One approval fixes all related hallucinations

Rollback Mechanisms

If a correction causes problems (for example, the correction itself is inaccurate):

Automatic Rollback Triggers

System can auto-rollback in edge cases:

IF: Verification shows correction made things worse
AND Accuracy decreased by >5 percentage points
THEN: Roll back correction; revert to previous fact sheet
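The trigger above reduces to one comparison. A sketch, assuming accuracies are measured in [0, 1] by the verification runs before and after deployment:

```python
# Auto-rollback trigger: revert the correction if verification accuracy
# dropped by more than 5 percentage points after deployment.

def should_rollback(accuracy_before, accuracy_after, max_drop_pp=5.0):
    drop_pp = (accuracy_before - accuracy_after) * 100
    return drop_pp > max_drop_pp

print(should_rollback(0.92, 0.84))  # → True (8-point drop)
print(should_rollback(0.92, 0.95))  # → False (accuracy improved)
```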

Manual Rollback

If correction approved and deployed but later found to be wrong:

  1. Go to Corrections History
  2. Find correction
  3. Click Rollback
  4. Previous version restored
  5. New alert created to correct the correction

Audit trail: All rollbacks logged with reason and approver

Monitoring Correction Effectiveness

Dashboard Metrics

Correction Status:

  • Pending Approval: X alerts
  • Deployed: Y alerts
  • Verified: Z alerts
  • Ineffective (requires escalation): W alerts

Verification Rate:

  • % of deployed corrections that verified successfully
  • Target: ≥90% verification rate
  • Below target → investigate why corrections are not working

Time-to-Correction:

  • Average time from detection to verified correction
  • Target: <24 hours for High/Critical

Bulk Action Efficiency:

  • Avg hallucinations fixed per bulk correction
  • Higher = more efficient
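The two headline metrics above fall out of a simple aggregation over correction records. An illustrative sketch; the statuses and field names are assumptions, not the product's schema:

```python
# Compute verification rate and average time-to-verification from a list
# of correction records. "Deployed" here means the correction reached
# verification (verified or ineffective); pending corrections are excluded.

corrections = [
    {"status": "verified", "hours_to_verify": 20},
    {"status": "verified", "hours_to_verify": 30},
    {"status": "ineffective", "hours_to_verify": None},
    {"status": "pending", "hours_to_verify": None},
]

deployed = [c for c in corrections if c["status"] in ("verified", "ineffective")]
verified = [c for c in deployed if c["status"] == "verified"]

verification_rate = len(verified) / len(deployed)  # target: >= 0.90
avg_time = sum(c["hours_to_verify"] for c in verified) / len(verified)

print(f"verification rate: {verification_rate:.0%}, avg time: {avg_time:.0f}h")
```

With the sample data the rate is 67%, which is below the 90% target and would prompt an investigation.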

Alerts When Corrections Fail

Ineffective correction alert:

Correction Ineffective: "TruthVouch founding date"
- Approved & deployed 3/15
- Verification run 3/17 shows AI still says wrong date
- Possible reasons:
  - Fact sheet wording unclear
  - AI engine uses a different knowledge source
  - Prompt injection / adversarial input
- Recommended action: manual review and regeneration

Next Steps

  1. Review auto-approval settings — Which corrections are safe to auto-approve?
  2. Create approval rules — Define thresholds for automatic vs. manual
  3. Train your team — How to review and approve corrections
  4. Monitor effectiveness — Track correction verification rates
  5. Iterate — Improve fact sheets based on verification feedback