Fact-Check Your RAG Pipeline

You built a RAG (Retrieval Augmented Generation) pipeline to answer questions about your organization. You inject your knowledge base into prompts, the LLM answers, and you hope the response is accurate. But what if the LLM hallucinates? What if it invents facts that aren't in your knowledge base? This guide shows you how to integrate fact-checking into your RAG pipeline so inaccurate responses are caught before they reach users.

Overview

RAG pipelines combine retrieval (finding relevant documents) with generation (the LLM answering). But even with good context, LLMs can still hallucinate. TruthVouch adds a fact-checking layer:

User Question
Retrieve context from knowledge base
LLM generates answer with context
TruthVouch verifies claims in answer ← NEW
If accurate: Return response
If hallucination: Flag/rewrite/block

This ensures users only see verified, accurate information.

Prerequisites

  1. TruthVouch account with Business tier or higher (Trust API included)
  2. Existing RAG pipeline (LLM + knowledge base)
  3. Truth Nuggets created (10+ facts about your organization)
  4. SDK installed for your language (Python, TypeScript, C#)

Step 1: Set Up the Trust API

The Trust API is TruthVouch’s programmatic interface for fact-checking.

  1. Go to Trust API → Getting Started

  2. Generate your API Key (keep this secret)

  3. Copy the base URL (https://api.truthvouch.ai/v1)

  4. Install the SDK for your language:

# Python
pip install truthvouch-trust
# TypeScript/Node
npm install @truthvouch/trust
# C#/.NET
dotnet add package TruthVouch.Trust

Step 2: Integrate Fact-Checking into Your RAG Pipeline

Add a verification step after LLM generation.

Python Example

from openai import OpenAI
from truthvouch_trust import TruthVouch

# Initialize clients
llm = OpenAI(api_key="sk-...")
truth = TruthVouch(api_key="vt-api-key")

def answer_question_with_verification(user_question: str) -> dict:
    # Step 1: Retrieve context from your knowledge base
    context = retrieve_from_knowledge_base(user_question)

    # Step 2: Generate answer with context
    response = llm.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": f"Answer questions about our company using this context: {context}"
            },
            {"role": "user", "content": user_question}
        ]
    )
    answer = response.choices[0].message.content

    # Step 3: Verify claims in the answer
    result = truth.verify(
        text=answer,
        context="company_information",  # Your fact category
        threshold=0.8  # Confidence threshold
    )

    # Step 4: Return result based on verification
    return {
        "answer": answer,
        "is_accurate": result.is_accurate,
        "verified_claims": result.verified_claims,
        "unverified_claims": result.unverified_claims,
        "confidence_score": result.confidence,
        "warnings": result.warnings or []
    }

# Usage
result = answer_question_with_verification("When was TruthVouch founded?")
if result["is_accurate"]:
    print(f"Verified answer: {result['answer']}")
else:
    print(f"Warning: Answer contains unverified claims: {result['warnings']}")
    # Option 1: Return the answer anyway with a warning badge
    # Option 2: Request a new answer from the LLM
    # Option 3: Return only the verified claims

TypeScript Example

import OpenAI from "openai";
import { TruthVouch } from "@truthvouch/trust";

const llm = new OpenAI({ apiKey: "sk-..." });
const truth = new TruthVouch({ apiKey: "vt-api-key" });

async function answerQuestionWithVerification(userQuestion: string) {
  // Step 1: Retrieve context
  const context = await retrieveFromKnowledgeBase(userQuestion);

  // Step 2: Generate answer
  const response = await llm.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: `Answer using this context: ${context}`
      },
      { role: "user", content: userQuestion }
    ]
  });
  const answer = response.choices[0].message.content;

  // Step 3: Verify
  const result = await truth.verify({
    text: answer,
    context: "company_information",
    threshold: 0.8
  });

  // Step 4: Return with verification status
  return {
    answer,
    isAccurate: result.isAccurate,
    verifiedClaims: result.verifiedClaims,
    unverifiedClaims: result.unverifiedClaims,
    confidenceScore: result.confidence,
    warnings: result.warnings || []
  };
}

Step 3: Handle Different Verification Outcomes

Depending on verification results, you can handle responses differently.

Outcome 1: Fully Accurate (confidence > 95%)

  • Return answer as-is
  • Display “Verified” badge to users
  • Log for analytics (building trust)

Outcome 2: Mostly Accurate (70-95% confidence)

  • Return answer with “Mostly Verified” disclaimer
  • Highlight unverified claims
  • Option to show sources

Outcome 3: Partially Accurate (50-70%)

  • Return answer with warning: “Some claims unverified”
  • Show verified vs. unverified claims separately
  • Recommend user check primary sources

Outcome 4: Inaccurate or Hallucinated (< 50%)

  • Don’t return answer
  • Log as hallucination alert
  • Option: Request LLM to regenerate with more constraints
  • Option: Return only verified claims

Example handler:

def handle_verification_result(result, answer):
    if result.confidence > 0.95:
        return {
            "answer": answer,
            "status": "verified",
            "badge": "Verified by TruthVouch"
        }
    elif result.confidence > 0.70:
        return {
            "answer": answer,
            "status": "mostly_verified",
            "badge": "Mostly Verified",
            "unverified_claims": result.unverified_claims
        }
    elif result.confidence > 0.50:
        return {
            "answer": f"Verified: {result.verified_claims}\n\nUnverified: {result.unverified_claims}",
            "status": "partially_verified",
            "badge": "Partially Verified - See Sources"
        }
    else:
        return {
            "answer": "Unable to verify this answer. Please check our documentation.",
            "status": "unverified",
            "alert": "Potential hallucination detected",
            "log_incident": True
        }

Step 4: Monitor Fact-Checking Metrics

Track your RAG pipeline’s accuracy over time.

  1. Go to Trust API → Usage Dashboard

  2. Monitor:

    • Total verifications: How many fact-checks did you run?
    • Accuracy rate: % of verified vs. unverified claims
    • Confidence distribution: Histogram of confidence scores
    • API latency: P50, P95, P99 response times
    • Cost: How many API calls, cost per call
  3. Drill down by:

    • Context: Which fact categories are most accurate?
    • Time: Is accuracy trending up or down?
    • User: Which users' questions produce the most (or least) accurate answers?

Example metrics:

This week:
- 4,200 verifications
- 92% fully verified (>95% confidence)
- 6% partially verified (50-95%)
- 2% hallucinations detected
- Avg latency: 180ms
- Cost: $126 for the week
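
Weekly numbers like these can also be computed from your own verification logs. A minimal sketch: the bucket boundaries mirror the outcome bands from Step 3, but the log-entry shape (a dict with a `confidence` float) is an assumption about how you store results, not a TruthVouch format.

```python
from collections import Counter

def summarize(logs: list[dict]) -> dict:
    """Bucket logged verification results into the Step 3 confidence bands."""
    buckets = Counter()
    for entry in logs:
        c = entry["confidence"]
        if c > 0.95:
            buckets["fully_verified"] += 1      # Outcome 1
        elif c >= 0.5:
            buckets["partially_verified"] += 1  # Outcomes 2-3
        else:
            buckets["hallucination"] += 1       # Outcome 4
    total = len(logs) or 1
    return {k: round(100 * v / total) for k, v in buckets.items()}

# Hypothetical log entries for illustration
logs = [{"confidence": 0.99}, {"confidence": 0.8},
        {"confidence": 0.2}, {"confidence": 0.97}]
print(summarize(logs))  # {'fully_verified': 50, 'partially_verified': 25, 'hallucination': 25}
```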

Step 5: Set Up Alerts for Hallucinations

Configure alerts when hallucinations are detected so your team can investigate.

  1. Go to Trust API → Alerts

  2. Create alert rules:

    • Alert when confidence drops below 70%
    • Alert when hallucinations detected (confidence < 50%)
    • Alert on specific topics (e.g., pricing, competitor claims)
  3. Configure channels:

    • Slack: Post to #rag-monitoring channel
    • Email: Daily digest of hallucinations
    • Webhook: Custom integration with your logging system
  4. Example alert:

Hallucination detected in RAG pipeline:
Question: "What was TruthVouch's founding date?"
Answer: "Founded in 2020"
Verified fact: "Founded in 2021"
Confidence: 15%
User: john@company.com
Action: Answer blocked from user
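
If you route alerts to a webhook, your receiver needs to decide what to do with each payload. A hedged sketch, assuming a payload shaped like the example alert above (the real webhook schema may differ; the routing targets are illustrative):

```python
def route_alert(payload: dict) -> str:
    """Decide where an incoming alert should go, by confidence band."""
    confidence = payload.get("confidence", 1.0)
    if confidence < 0.5:
        return "page-oncall"            # likely hallucination: investigate now
    if confidence < 0.7:
        return "slack:#rag-monitoring"  # low confidence: triage async
    return "log-only"                   # informational, no action needed

# Hypothetical payload mirroring the example alert above
alert = {
    "question": "What was TruthVouch's founding date?",
    "answer": "Founded in 2020",
    "confidence": 0.15,
}
print(route_alert(alert))  # page-oncall
```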

Step 6: Iterate on Your Knowledge Base

Use verification data to improve your Truth Nuggets.

  1. Monthly review of unverified claims:

    • Which topics are most commonly hallucinated?
    • Are your Truth Nuggets detailed enough?
    • Do you need more specific facts?
  2. Update Truth Nuggets based on patterns:

    • “Pricing starts at $349/month” → Add details by tier
    • “Founded in 2021” → Add “March 2021” for more precision
    • Add sources/citations to facts that often trigger low confidence
  3. Retrain the LLM (optional):

    • If confidence is low despite accurate facts, your LLM may need fine-tuning
    • Add verified Q&A pairs to training data
    • Use TruthVouch’s retrieval context as in-context examples

Real-World Example

Scenario: You built a customer support chatbot that answers product questions.

  1. Integrate fact-checking:

    • Questions about pricing, features, billing cycle get verified
    • Unverified answers are flagged to human support
  2. Monitor over 1 week:

    • 500 customer questions answered
    • 94% fully verified
    • 4% partially verified
    • 2% hallucinations (customer asked about features that don’t exist)
  3. Improve:

    • Update Truth Nuggets with more detailed feature descriptions
    • Add FAQs for commonly misunderstood features
    • Train support team on handling hallucination alerts
  4. Results:

    • After 2 weeks: 98% fully verified
    • Customer satisfaction improves (accurate answers)
    • Support team spends less time correcting bot errors

Integration Patterns

Pattern 1: Block Hallucinations

if not result.is_accurate:
    return {"error": "Unable to answer. Please contact support."}
else:
    return {"answer": answer}

Pattern 2: Show Confidence Badges

if result.confidence > 0.9:
    badge = "Verified"
elif result.confidence > 0.7:
    badge = "Mostly Verified"
else:
    badge = "Check Sources"
return {"answer": answer, "badge": badge}

Pattern 3: Return Only Verified Claims

if result.confidence < 0.7:
    verified_only = format_verified_claims(result.verified_claims)
    return {"answer": verified_only, "note": "Some claims could not be verified"}
else:
    return {"answer": answer}
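
`format_verified_claims` is left undefined above. One possible implementation, assuming verified claims arrive as a list of strings (the actual shape of `result.verified_claims` depends on the SDK):

```python
def format_verified_claims(claims: list[str]) -> str:
    """Render only the verified claims as a user-facing bullet list."""
    if not claims:
        return "No claims could be verified."
    return "Verified information:\n" + "\n".join(f"- {c}" for c in claims)

# Example claims taken from facts mentioned elsewhere in this guide
print(format_verified_claims(["Founded in 2021", "Pricing starts at $349/month"]))
```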

Pattern 4: Regenerate on Hallucination

for attempt in range(3):
    answer = llm.generate_with_context(question, context)
    result = truth.verify(answer)
    if result.is_accurate:
        return answer
    # If hallucination, try again with more constraints
return {"error": "Unable to generate verified answer"}
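
The "more constraints" comment can be made concrete by tightening the system prompt on each retry. A sketch; the prompt wording and helper name are illustrative, not a TruthVouch recommendation:

```python
BASE_PROMPT = "Answer using only this context: {context}"

# Progressively stricter instructions for each retry
RETRY_SUFFIXES = [
    "",
    " Do not state anything not explicitly present in the context.",
    " If the context does not contain the answer, say you don't know.",
]

def prompt_for_attempt(context: str, attempt: int) -> str:
    """Build the system prompt for a given retry attempt (clamped to the strictest)."""
    suffix = RETRY_SUFFIXES[min(attempt, len(RETRY_SUFFIXES) - 1)]
    return BASE_PROMPT.format(context=context) + suffix

print(prompt_for_attempt("TruthVouch was founded in 2021.", 2))
```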

Performance Considerations

  • Latency: Fact-checking adds ~150-250ms per response. Acceptable for most applications.
  • Cost: Each verification call costs $0.03. 1000 verifications/month = ~$30/month.
  • Throughput: API can handle 1000 requests/second. Cache frequent queries.

Optimization tips:

  • Cache verification results for identical questions
  • Batch verify multiple claims in one call
  • Use lower confidence threshold for non-critical answers
  • Disable fact-checking for non-factual questions (opinions, summaries)

Next Steps

Questions? Reach out to your Solutions Engineer or post in the in-app support chat.