
Chatbot Fact-Checking

Conversational AI systems must provide accurate, verifiable responses in real time. TruthVouch Shield integrates into chatbots to detect hallucinations, verify facts against your knowledge base, and flag uncertain responses before they reach users.

Real-Time Verification Flow

TruthVouch Shield performs verification at three critical points:

  1. Pre-response verification: Check constraints before generation
  2. Streaming verification: Fact-check as tokens arrive (low-latency)
  3. Post-response verification: Deep analysis of complete responses

Architecture: Streaming Verification

For conversational systems, streaming verification reduces latency:

User Query
[LLM Stream Begins]
[Chunks Verified in Parallel] ← TruthVouch Shield
[Confidence Score Accumulates]
[Uncertain Chunks Flagged]
User Receives: Response + Flags

Implementation Examples

Python: Real-Time Chatbot

from truthvouch.shield import StreamVerificationClient
import openai

client = StreamVerificationClient(api_key="your-api-key")

async def chat_with_verification(messages: list[dict]):
    """Stream chatbot response chunks with real-time fact-checking."""
    verified_chunks = []
    flags = []
    async with openai.AsyncOpenAI() as openai_client:
        # Start streaming LLM response
        stream = await openai_client.chat.completions.create(
            model="gpt-4",
            messages=messages,
            stream=True,
        )
        # Verify chunks as they arrive
        async for chunk in stream:
            text = chunk.choices[0].delta.content
            if not text:
                continue
            try:
                result = await client.verify_streaming_fact(
                    text=text,
                    context=messages[0].get("context", ""),
                )
                verified_chunks.append({
                    "text": text,
                    "confidence": result.confidence,
                    "verified": result.confidence > 0.7,
                })
                if result.confidence < 0.7:
                    flags.append({
                        "position": len(verified_chunks) - 1,
                        "reason": result.reason,
                        "confidence": result.confidence,
                    })
            except Exception:
                # On verification timeout, stream the chunk anyway
                verified_chunks.append({
                    "text": text,
                    "confidence": None,
                    "verified": None,
                })
            # Yield to client immediately (don't wait for verification)
            yield {
                "chunk": text,
                "verified_chunks": verified_chunks,
                "flags": flags,
            }
    # A generator can't return a value, so yield a final summary instead.
    # Average confidence only over chunks that were actually scored.
    scored = [c for c in verified_chunks if c["confidence"] is not None]
    yield {
        "full_response": "".join(c["text"] for c in verified_chunks),
        "verified_chunks": verified_chunks,
        "overall_confidence": (
            sum(c["confidence"] for c in scored) / len(scored) if scored else None
        ),
        "flags": flags,
    }

TypeScript: Next.js Chatbot

import { StreamVerificationClient } from "@truthvouch/sdk";
import { streamText, StreamingTextResponse } from "ai";
import { openai } from "@ai-sdk/openai";

const verificationClient = new StreamVerificationClient({
  apiKey: process.env.TRUTHVOUCH_API_KEY
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  // Start LLM stream
  const result = await streamText({
    model: openai("gpt-4"),
    messages
  });
  // Map to newline-delimited JSON events with verification metadata
  const stream = result.textStream.pipeThrough(
    new TransformStream<string, string>({
      transform(text, controller) {
        // Verify in background (fire-and-forget for latency)
        verificationClient.verifyStreamingFact({
          text,
          context: messages[0].context
        }).then(verification => {
          // Send verification metadata as a separate event
          controller.enqueue(JSON.stringify({
            type: "verification",
            confidence: verification.confidence,
            flags: verification.flags
          }) + "\n");
        }).catch(err => {
          console.error("Verification failed:", err);
          // Continue streaming anyway
        });
        // Yield text immediately
        controller.enqueue(JSON.stringify({ type: "text", content: text }) + "\n");
      }
    })
  );
  return new StreamingTextResponse(stream);
}

Python: Multi-Turn Conversation

from truthvouch.shield import VerificationClient
from typing import Optional

class VerifiedChatbot:
    def __init__(self, api_key: str):
        self.client = VerificationClient(api_key=api_key)
        self.conversation = []
        self.verification_cache = {}

    async def chat(
        self,
        user_message: str,
        knowledge_base_id: Optional[str] = None,
    ) -> dict:
        """Send a message and get a verified response."""
        self.conversation.append({
            "role": "user",
            "content": user_message,
        })
        # Generate response (`llm` is a placeholder for your LLM client)
        response = llm.generate(
            messages=self.conversation,
            system="You are a helpful assistant. Only cite facts from your knowledge base.",
        )
        # Verify the entire response
        verification = await self.client.verify_fact(
            text=response,
            context=user_message,
            knowledge_base_id=knowledge_base_id,
        )
        result = {
            "response": response,
            "confidence": verification.confidence,
            "verified": verification.confidence > 0.7,
            "citations": verification.citations,
            "uncertainties": verification.uncertainties,
        }
        # Cache for this conversation
        self.verification_cache[response] = verification
        self.conversation.append({
            "role": "assistant",
            "content": response,
            "metadata": {
                "confidence": verification.confidence,
                "verified": verification.confidence > 0.7,
            },
        })
        return result

Handling Uncertain Responses

When confidence is below threshold, present responses transparently:

function ChatbotMessage({ message, verification }) {
  const [showDetails, setShowDetails] = useState(false);
  const isUncertain = verification.confidence < 0.7;
  return (
    <div className={`message ${isUncertain ? 'uncertain' : 'verified'}`}>
      <p>{message.text}</p>
      {isUncertain && (
        <div className="uncertainty-warning">
          <Icon>warning</Icon>
          <span>
            This response has low confidence ({(verification.confidence * 100).toFixed(0)}%).
            It may contain inaccuracies.{' '}
            <button onClick={() => setShowDetails(true)}>
              View details
            </button>
          </span>
        </div>
      )}
      {isUncertain && showDetails && (
        <details className="verification-details">
          <summary>Verification Details</summary>
          <ul>
            {verification.uncertainties.map(u => (
              <li key={u.id}>
                <strong>"{u.text}"</strong> - {u.reason}
              </li>
            ))}
          </ul>
        </details>
      )}
      {verification.citations.length > 0 && (
        <div className="citations">
          <strong>Sources:</strong>
          <ul>
            {verification.citations.map(c => (
              <li key={c.id}>
                <a href={c.url}>{c.title}</a>
              </li>
            ))}
          </ul>
        </div>
      )}
    </div>
  );
}

Performance Optimization

Reduce Verification Latency

  • Token batching: Verify every 50 tokens instead of every token
  • Sampling: Verify 70% of chunks (skip repetitive content)
  • Caching: Reuse verification for similar queries (Redis, 5-min TTL)

from typing import Optional

from truthvouch.shield import StreamVerificationClient

class OptimizedVerificationClient:
    def __init__(self, api_key: str, batch_size: int = 50):
        self.client = StreamVerificationClient(api_key=api_key)
        self.batch_size = batch_size
        self.buffer = []
        self.cache = {}

    async def verify_batched(self, chunk: str) -> Optional[dict]:
        self.buffer.append(chunk)
        if len(self.buffer) < self.batch_size:
            return None  # Not enough to verify yet
        # Verify buffered text
        text = "".join(self.buffer)
        cache_key = hash(text)
        if cache_key in self.cache:
            self.buffer = []
            return self.cache[cache_key]
        result = await self.client.verify_streaming_fact(text=text)
        self.cache[cache_key] = result
        self.buffer = []
        return result
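
The caching bullet above suggests Redis with a 5-minute TTL. For illustration, here is a minimal in-process stand-in with the same expire-on-read semantics; the `TTLCache` class and its 300-second default are assumptions for this sketch, not part of the SDK.

```python
import time

class TTLCache:
    """Tiny time-bounded cache, standing in for Redis SETEX semantics."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict lazily on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
```

In production, swap this for a shared Redis instance so the cache survives restarts and is shared across workers.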

Quota Management

Streaming verification uses 0.01 credits per chunk. For high-traffic chatbots:

  • Implement per-user rate limits (100 messages/hour)
  • Use batch verification for off-peak processing
  • Cache results across similar conversations
  • Monitor costs and adjust sampling rate
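
The per-user limit above (100 messages/hour) can be enforced with a small sliding-window counter. This sketch is illustrative and not part of the SDK; the limiter simply decides whether a message should go through verification.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` events per `window_seconds` per user."""

    def __init__(self, limit: int = 100, window_seconds: float = 3600.0):
        self.limit = limit
        self.window = window_seconds
        self._events = defaultdict(deque)  # user_id -> recent timestamps

    def allow(self, user_id: str) -> bool:
        now = time.monotonic()
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False  # over quota: skip or queue verification
        events.append(now)
        return True
```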

Best Practices

User Experience

  • Show confidence visually (progress bar, color indicator)
  • Provide “Report incorrect answer” button
  • Surface citations for transparency
  • Don’t block streaming on verification (show lazy flags)

Reliability

  • Use timeouts (max 2s per verification)
  • Fall back to unverified response on timeout
  • Log all timeouts for monitoring
  • Implement exponential backoff for retries
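
The rules above (2 s timeout, fall back to unverified, exponential backoff) can be combined in one wrapper. In this sketch, `verify` stands for any awaitable verification call; the function name and parameters are assumptions for illustration.

```python
import asyncio

async def verify_with_fallback(verify, *, timeout=2.0, retries=2, base_delay=0.1):
    """Run `verify()` with a timeout; retry with exponential backoff, else return None."""
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(verify(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == retries:
                # Give up: fall back to the unverified response (log for monitoring)
                return None
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
```

A `None` result tells the caller to stream the response without verification metadata rather than blocking the user.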

Privacy

  • Never log full conversations in verification requests
  • Use knowledge_base_id instead of raw context
  • Implement user-level caching with proper isolation
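
One way to keep user-level cache entries isolated (the last bullet) is to namespace every cache key by a hashed user ID, so results can never leak across users. The keying scheme below is an illustration, not SDK behavior.

```python
import hashlib

def user_cache_key(user_id: str, query: str) -> str:
    """Namespace cache entries per user so results never cross user boundaries."""
    user_ns = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    query_digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return f"verify:{user_ns}:{query_digest}"
```

Hashing also keeps raw user identifiers out of the cache layer itself.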

Troubleshooting

Q: Streaming feels slower with verification

  • Switch to token batching (verify every 50 tokens)
  • Reduce context window (use only last 3 turns)
  • Implement response caching

Q: High false positive rate

  • Increase confidence threshold to 0.8
  • Add domain-specific verification rules
  • Use smaller verification chunks (100-word max)

Q: Verification timeout on long responses

  • Use token sampling instead of full verification
  • Implement streaming timeout fallback
  • Consider batch verification for post-processing
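
Token sampling (also listed under Performance Optimization) works best when it is deterministic, so the same chunk always gets the same verify/skip decision across retries. The hash-based scheme below is one possible sketch, not SDK behavior.

```python
import hashlib

def should_verify(chunk: str, sample_rate: float = 0.7) -> bool:
    """Deterministically sample chunks for verification at roughly `sample_rate`."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    # Map the first 4 bytes to [0, 1) and compare against the sampling rate
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < sample_rate
```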

Next Steps