Auto-Correction Pipeline
The Auto-Correction Pipeline automatically generates, deploys, and verifies corrected responses for hallucinations. The system uses AI-powered methods to create optimal corrections while maintaining full audit trails. Approvals can be automatic or manual depending on your confidence level and risk tolerance.
How Auto-Correction Works
Step 1: Detection
Shield detects a hallucination (the AI response contradicts a Truth Nugget).
Detection method:
- Natural Language Inference (NLI) model compares AI response to fact
- Confidence threshold: ≥60% required to flag a true positive
- Cross-check against multiple AI engines to increase confidence
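The detection logic above can be sketched in a few lines. This is an illustrative model only: the actual NLI model and engine APIs are not shown in this document, and the `is_hallucination`/`cross_check` names and score inputs are assumptions.

```python
# Hypothetical sketch of the detection step. Contradiction scores in [0, 1]
# are assumed to come from an NLI model comparing response vs. Truth Nugget.

CONFIDENCE_THRESHOLD = 0.60  # matches the >=60% threshold above

def is_hallucination(contradiction_score: float) -> bool:
    """Flag a response whose NLI contradiction score meets the threshold."""
    return contradiction_score >= CONFIDENCE_THRESHOLD

def cross_check(scores_by_engine: dict[str, float]) -> bool:
    """Cross-check against multiple engines: flag only if a majority
    of engines agree, which raises confidence in the detection."""
    flags = [is_hallucination(s) for s in scores_by_engine.values()]
    return sum(flags) > len(flags) / 2

# Example: two of three engines see a contradiction -> flagged
print(cross_check({"openai": 0.82, "anthropic": 0.71, "google": 0.40}))
```

Requiring majority agreement is one plausible way to realize "cross-check against multiple AI engines"; a weighted vote would work equally well.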
Step 2: Correction Generation
The system automatically generates a corrected AI response using your Truth Nugget as the reference.
Three automated methods:
Method A: Neural Fact Sheet (Recommended)
- AI automatically creates an optimized summary of the correct fact
- Embedded in the knowledge base / RAG context
- Future queries to monitored AI engines see the correct fact
- System learns from your fact nugget and improves AI behavior over time
Method B: Direct Response Correction
- Direct replacement text (what AI should have said instead)
- Replaces hallucination with approved text
- Some LLM APIs support this; not all providers allow it
Method C: Prompt Engineering
- Refine the prompt/system message to reduce hallucinations
- Adds context or instructions to prevent the error
- Slower to deploy; affects all future requests
Default: Method A (Neural Fact Sheet) — Best balance of effectiveness and ease.
Step 3: Approval
Correction awaits approval based on severity and confidence:
Auto-Approve (if configured):
- Low-severity alerts + high confidence (>85%) → Auto-approved
- Low-risk updates (product name, founding date) → Auto-approved
- No human review (fast); system logs auto-approval for audit
Manual Approve (default):
- High-severity or lower confidence → Requires human review
- Fact owner reviews correction quality and accuracy
- Click “Approve” or “Request Changes”
- If changes requested → Regenerate correction
Approval time: 15 minutes (median)
Step 4: Deployment
Approved correction is automatically deployed to AI monitoring systems.
Automated deployment:
- Neural Fact Sheets deployed to vector database / knowledge base
- Embedded in RAG pipeline
- Accessible to all monitored AI engines (OpenAI, Anthropic, Google, etc.)
- Takes 15-30 seconds to deploy with zero downtime
Automatic verification:
- Shield automatically re-queries the AI engine 24-72 hours later
- Tests if AI now gives correct answer
- No manual follow-up required
Step 5: Verification
Shield re-checks if hallucination is fixed:
Verification process:
- Same AI engine queried with same/similar prompt
- Response compared to Truth Nugget
- Confidence measured: is correction working?
Outcomes:
- Verified (95% of cases): AI now correct; hallucination fixed
- Partially Verified: Correct on some variations, not all
- Unverified: Still hallucinating; correction ineffective
Verification time: 24-72 hours (batch-processed nightly)
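The three outcomes above reduce to a simple classification over the re-run prompt variations. A minimal sketch, assuming each variation has already been scored correct or incorrect:

```python
# Illustrative outcome logic for the verification step. Each element of
# `results` is True if the AI answered that prompt variation correctly.

def classify_verification(results: list[bool]) -> str:
    """Map per-variation correctness to a verification outcome."""
    if all(results):
        return "Verified"            # AI now correct on every variation
    if any(results):
        return "Partially Verified"  # correct on some variations, not all
    return "Unverified"              # still hallucinating

print(classify_verification([True, True, True]))   # Verified
print(classify_verification([True, False, True]))  # Partially Verified
print(classify_verification([False, False]))       # Unverified
```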
Step 6: Resolution
The alert is marked resolved once verification completes:
Status:
- Resolved - Corrected & Verified — Success
- Escalated - Correction Ineffective — Manual investigation needed
- Resolved - Dismissed — False positive (if dismissed instead of corrected)
Approval Workflows
Automatic Approval
Configuration:
Go to Settings → Correction Approval and define auto-approval rules:
Rule 1: Auto-approve low-severity hallucinations
IF: Severity = Low AND Confidence ≥ 80% AND Fact category NOT "Financial" or "Compliance"
THEN: Auto-approve without review

Rule 2: Auto-approve minor corrections
IF: Confidence ≥ 90% AND Change is minor (typo, formatting, name variation)
THEN: Auto-approve

Rule 3: Require approval for high-risk items
IF: Severity = Critical OR Fact category = "Financial" or "Compliance" OR Confidence < 70%
THEN: Require manual approval

Benefits:
- Faster time-to-correction (no wait for review)
- Reduces manual workload for low-risk items
- Maintains audit trail (auto-approvals logged with reason)
Risk:
- Occasional approval of false positives (rare; <1% with high confidence threshold)
- Mitigation: System learns from corrections; can roll back if ineffective
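The three rules above can be expressed as a small decision function. The field names (`severity`, `confidence`, `category`, `is_minor_change`) are assumptions for illustration, not Shield's actual rule schema; note that the high-risk rule is checked first so it always overrides the auto-approve rules.

```python
# Sketch of the auto-approval rules. Severity and category values mirror
# the rule text above; confidence is a fraction in [0, 1].

HIGH_RISK_CATEGORIES = {"Financial", "Compliance"}

def approval_decision(severity: str, confidence: float,
                      category: str, is_minor_change: bool = False) -> str:
    # Rule 3: high-risk items always require manual approval
    if severity == "Critical" or category in HIGH_RISK_CATEGORIES or confidence < 0.70:
        return "manual"
    # Rule 1: low-severity, confident, non-sensitive -> auto-approve
    if severity == "Low" and confidence >= 0.80:
        return "auto"
    # Rule 2: very confident minor corrections -> auto-approve
    if confidence >= 0.90 and is_minor_change:
        return "auto"
    return "manual"  # default: a human reviews everything else

print(approval_decision("Low", 0.85, "Product"))         # auto
print(approval_decision("Critical", 0.95, "Financial"))  # manual
```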
Manual Approval
Process:
- Alert created; correction suggested
- Fact owner notified (email/Slack)
- Owner clicks alert in dashboard
- Reviews:
- What AI said (hallucination)
- Truth Nugget (correct fact)
- Suggested Neural Fact Sheet
- Confidence score and evidence
- Approves, requests changes, or dismisses
- If approved → Correction deploys
Approval options:
- Approve as-is — Accept suggested correction
- Request Changes — Suggest different wording; system regenerates
- Dismiss — False positive; mark alert resolved without correction
- Escalate — Assign to someone else for decision
Approval time (SLA): Varies by severity
- Critical: 15 min response SLA
- High: 1 hour response SLA
- Medium: 4 hour response SLA
- Low: 24 hour response SLA
Neural Fact Sheets
A Neural Fact Sheet is an AI-friendly summary of your fact, embedded in the AI’s context.
Structure
FACT: Founding Date
STATEMENT: TruthVouch was founded in 2021
CONTEXT: Founded by Eyal Chen and team at Stanford
SOURCE: Company website, Inc.com profile
CONFIDENCE: High (from official sources)
EXAMPLES:
- "In 2021, TruthVouch was founded..."
- "Founded on March 15, 2021..."
- "TruthVouch's founding year is 2021..."

How It Works
- Embedding: Neural Fact Sheet converted to vector embedding (via embeddings API)
- Storage: Stored in vector database with all other fact sheets
- Retrieval: When AI engine is queried, relevant facts retrieved by semantic search
- Context: Facts injected into AI’s prompt as context (“Remember that TruthVouch was founded in 2021…”)
- Inference: AI uses fact sheet context to generate correct response
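The embed → store → retrieve → inject flow above can be sketched end to end. This is a toy model: a real deployment would use an embeddings API and a vector database, whereas here a bag-of-words vector and cosine similarity stand in for both, and the fact-sheet texts are abbreviated.

```python
# Toy sketch of the fact-sheet pipeline: embedding, storage, semantic
# retrieval, and context injection into the AI prompt.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in for an embeddings API: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Storage: fact sheets indexed alongside their embeddings
fact_sheets = [
    "FACT: Founding Date. TruthVouch was founded in 2021.",
    "FACT: Pricing. TruthVouch's Standard plan costs $500 per month.",
]
index = [(embed(s), s) for s in fact_sheets]

def retrieve(query: str) -> str:
    """Semantic search: return the most relevant fact sheet."""
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[0]))[1]

def build_prompt(query: str) -> str:
    """Context injection: prepend the retrieved fact to the AI prompt."""
    return f"Remember: {retrieve(query)}\n\nUser question: {query}"

print(build_prompt("When was TruthVouch founded?"))
```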
Quality
Higher-quality fact sheets = better AI correction:
Effective fact sheet:
- Clear, concise statement (1-2 sentences)
- Context explaining WHY (helps AI understand nuance)
- Examples of how fact might be phrased
- Sourced (where you learned this fact)
- Confidence level (High/Medium/Low)
Example:
FACT: Pricing
STATEMENT: TruthVouch's Standard plan costs $500/month
CONTEXT: Price includes up to 5M cross-checks and 3 Truth Nuggets
SOURCE: Pricing page (pricing.truthvouch.com), current as of 2024-Q1
CONFIDENCE: High
EXAMPLES:
- "$500 per month for Standard plan"
- "Standard pricing is $500/month with 5M checks"
- "Enterprise customers pay custom pricing above Standard tier"

Bulk Corrections
Correct multiple hallucinations efficiently:
Scenario: Same Fact Hallucinated 5 Times
Instead of 5 separate corrections, approve once:
- Select multiple related alerts (checkbox)
- Click Bulk Actions → Apply Correction
- Review suggested correction (same for all)
- Approve once
- Deployed to all monitored AI engines at once
Time saved: Instead of 5 × 15 min = 75 min, now 15 min total
Scenario: Fact Update Affects Multiple Hallucinations
When you update a Truth Nugget:
- Edit Nugget → New text saved
- System finds all recent hallucinations related to this fact
- Suggests correction for all (based on new nugget)
- Bulk approve option
Example: Update “employee count” from “150” to “200”
- Find all alerts where AI said incorrect employee count
- Suggest new fact sheet with updated count
- One approval fixes all related hallucinations
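The bulk flow above amounts to grouping alerts by their Truth Nugget so one approval covers the whole group. A minimal sketch, where the alert dictionary shape (`id`, `nugget_id`) is an assumption for illustration:

```python
# Group related alerts so a single approved correction resolves all of them.

def group_by_nugget(alerts: list[dict]) -> dict[str, list[dict]]:
    groups: dict[str, list[dict]] = {}
    for alert in alerts:
        groups.setdefault(alert["nugget_id"], []).append(alert)
    return groups

alerts = [
    {"id": 1, "nugget_id": "founding-date"},
    {"id": 2, "nugget_id": "founding-date"},
    {"id": 3, "nugget_id": "pricing"},
]
groups = group_by_nugget(alerts)
# One approval per nugget resolves every alert in its group
print({k: len(v) for k, v in groups.items()})  # {'founding-date': 2, 'pricing': 1}
```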
Rollback Mechanisms
If a correction causes problems (i.e., the correction itself is inaccurate):
Automatic Rollback Triggers
System can auto-rollback in edge cases:
IF: Verification shows correction made things worse AND Accuracy decreased by >5 percentage points
THEN: Roll back correction; revert to previous fact sheet

Manual Rollback
If correction approved and deployed but later found to be wrong:
- Go to Corrections History
- Find correction
- Click Rollback
- Previous version restored
- New alert created to correct the correction
Audit trail: All rollbacks logged with reason and approver
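The automatic trigger above is a simple before/after comparison. A minimal sketch, assuming verified accuracy is reported as a percentage:

```python
# Automatic rollback trigger: revert when verified accuracy drops by
# more than 5 percentage points after the correction is deployed.

ROLLBACK_THRESHOLD_PP = 5.0

def should_rollback(accuracy_before: float, accuracy_after: float) -> bool:
    """Accuracies are percentages (0-100); compare in percentage points."""
    return (accuracy_before - accuracy_after) > ROLLBACK_THRESHOLD_PP

print(should_rollback(92.0, 84.0))  # True: dropped 8 points
print(should_rollback(92.0, 90.0))  # False: within tolerance
```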
Monitoring Correction Effectiveness
Dashboard Metrics
Correction Status:
- Pending Approval: X alerts
- Deployed: Y alerts
- Verified: Z alerts
- Ineffective (requires escalation): W alerts
Verification Rate:
- % of deployed corrections that verified successfully
- Target: ≥90% verification rate
- Below target → Investigate why corrections are not working
Time-to-Correction:
- Average time from detection to verified correction
- Target: <24 hours for High/Critical
Bulk Action Efficiency:
- Avg hallucinations fixed per bulk correction
- Higher = more efficient
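The verification-rate check above is straightforward arithmetic; this sketch computes the rate over deployed corrections and flags it against the ≥90% target (function names are illustrative):

```python
# Dashboard math: share of deployed corrections that verified successfully,
# checked against the >=90% target from the metrics above.

TARGET_RATE = 0.90

def verification_rate(verified: int, deployed: int) -> float:
    return verified / deployed if deployed else 0.0

def needs_investigation(verified: int, deployed: int) -> bool:
    return verification_rate(verified, deployed) < TARGET_RATE

print(verification_rate(47, 50))    # 0.94
print(needs_investigation(47, 50))  # False: at/above target
```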
Alerts When Corrections Fail
Ineffective correction alert:
Correction Ineffective: "TruthVouch founding date"
- Approved & deployed 3/15
- Verification run 3/17 shows AI still says wrong date
- Possible reasons:
  - Fact sheet wording unclear
  - AI engine uses different knowledge source
  - Prompt injection/adversarial
- Recommended action: Manual review and regeneration

Related Topics
- Neural Fact Sheets — Deep dive on fact sheet structure
- Correction History — Track all corrections and audit trail
- Overview — Corrections pipeline overview
Next Steps
- Review auto-approval settings — Which corrections are safe to auto-approve?
- Create approval rules — Define thresholds for automatic vs. manual
- Train your team — How to review and approve corrections
- Monitor effectiveness — Track correction verification rates
- Iterate — Improve fact sheets based on verification feedback