Testing Policies
Overview
Test every policy before deployment. TruthVouch provides a test harness to validate that policies work correctly on sample data.
Quick Test
- Go to Governance → Policies → [Policy] → Test
- Enter test input:
  {
    "type": "request",
    "user_id": "user_123",
    "model": "gpt-4",
    "tokens": 5000,
    "text": "What is AI?"
  }
- Click Run Test
- Check whether the policy triggers on the input or allows it
Test Cases
Create multiple test cases to cover scenarios:
Test Case 1: Should Block
Input:
  type: request
  text: "My SSN is 123-45-6789"
Expected: BLOCKED
Expected Message: Contains PII
Test Case 2: Should Allow
Input:
  type: request
  text: "What is machine learning?"
Expected: ALLOWED
Test Case 3: Edge Case
Input:
  type: request
  text: "The example SSN format is 123-45-6789"
Expected: ? (Check if false positive)
Test Input Format
Structure test input to match your policy:
{
  "type": "request",  // or "response"
  "user_id": "user_123",
  "model": "gpt-4",
  "tokens": 5000,
  "text": "...",
  "destination": "external",
  "user_type": "internal",
  "safety_flags": {
    "toxicity": 0.2,
    "bias": 0.1
  }
}
Use fields your policy actually checks.
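A quick way to confirm that a test input uses the fields a policy checks is to compare the two sets of field names. The sketch below is illustrative only: `POLICY_FIELDS` and `check_test_input` are hypothetical names, and the list of fields a policy reads is hard-coded here rather than taken from TruthVouch.

```python
# Hypothetical: the input fields one particular policy reads. In practice this
# would come from reading the policy definition; it is hard-coded for illustration.
POLICY_FIELDS = {"type", "text", "destination"}


def check_test_input(test_input: dict, policy_fields: set) -> dict:
    """Report fields the policy reads that are missing, and fields it ignores."""
    provided = set(test_input)
    return {
        "missing": sorted(policy_fields - provided),
        "unused": sorted(provided - policy_fields),
    }


test_input = {
    "type": "request",
    "user_id": "user_123",
    "text": "What is AI?",
}

report = check_test_input(test_input, POLICY_FIELDS)
print(report)
```

A non-empty `missing` list is a common reason a policy silently fails to trigger in a test: the condition refers to a field the input never supplied.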
Test Coverage
Aim for full coverage across these four categories:
- Happy Path: Normal, allowed input
- Violation Path: Input that should trigger denial
- Edge Cases: Boundary conditions
- False Positives: Legitimate input that might wrongly trigger
Example for “Block API Keys” Policy:
Test 1: Normal text
Input: "How do I use the API?"
Expected: ALLOWED
Test 2: Exact API key
Input: "api_key=sk_live_abc123def456ghi789"
Expected: BLOCKED
Test 3: Partial key (false positive check)
Input: "Documentation: api_key parameter"
Expected: ALLOWED
Test 4: Encoded key
Input: "ak_prod_6f7c8d9e0a1b2c3d4e5f6a7b8c9d0e"
Expected: ?
Batch Testing
Test multiple cases at once:
- Click + Add Test Case
- Enter name, input, expected result
- Repeat for all cases
- Click Run All Tests
- See pass/fail for each
Results:
Test Suite: API Key Protection
✓ Test 1: Normal text - PASSED
✓ Test 2: Exact API key - PASSED
✓ Test 3: Partial key - PASSED
⚠ Test 4: Encoded key - FAILED (expected BLOCKED, got ALLOWED)
Pass Rate: 75% (3/4)
If any fail, adjust policy logic and retest.
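The same batch workflow can be prototyped locally before entering cases in the UI. This is a minimal sketch, not the TruthVouch test harness: `API_KEY_PATTERN` is an assumed stand-in for whatever the "Block API Keys" policy actually matches, and `evaluate` and `run_suite` are hypothetical helpers.

```python
import re

# Assumed stand-in for the policy logic: "api_key=" followed by a key-like
# token. The real policy's pattern may differ.
API_KEY_PATTERN = re.compile(r"api_key=sk_(live|test)_[A-Za-z0-9]{8,}")


def evaluate(text: str) -> str:
    """Return BLOCKED if the text matches the assumed pattern, else ALLOWED."""
    return "BLOCKED" if API_KEY_PATTERN.search(text) else "ALLOWED"


# (name, input text, expected result)
TEST_CASES = [
    ("Normal text", "How do I use the API?", "ALLOWED"),
    ("Exact API key", "api_key=sk_live_abc123def456ghi789", "BLOCKED"),
    ("Partial key", "Documentation: api_key parameter", "ALLOWED"),
    ("Encoded key", "ak_prod_6f7c8d9e0a1b2c3d4e5f6a7b8c9d0e", "BLOCKED"),
]


def run_suite(cases):
    """Evaluate every case and record whether actual matched expected."""
    results = []
    for name, text, expected in cases:
        actual = evaluate(text)
        results.append((name, expected, actual, actual == expected))
    return results


results = run_suite(TEST_CASES)
passed = sum(1 for r in results if r[3])
for name, expected, actual, ok in results:
    status = "PASSED" if ok else f"FAILED (expected {expected}, got {actual})"
    print(f"{name} - {status}")
print(f"Pass Rate: {100 * passed // len(results)}% ({passed}/{len(results)})")
```

With this assumed pattern, the fourth case fails because the key uses a prefix the pattern does not know about, which is the kind of gap a batch run is meant to expose.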
Dry-Run Mode
Deploy a policy without enforcement. In dry-run mode the policy logs violations but does not block requests.
- Go to Governance → Policies → [Policy]
- Click Deploy
- Select Dry-Run Mode
- Set duration: 24 hours, 7 days, etc.
- Click Deploy to Dry-Run
What happens:
- Policy evaluated on all requests
- Violations logged to audit trail
- User request proceeds (not blocked)
- You see real data on policy impact
After dry-run:
- Go to Reports → Policy Impact
- See how many violations would have been blocked
- Check for false positives
- If good: Change to enforcement mode
- If issues: Adjust policy, retest
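The difference between dry-run and enforcement can be summarized in a few lines. The sketch below is a conceptual model only; `PolicyRunner`, its `mode` values, and the SSN pattern are assumptions made for illustration and do not reflect how TruthVouch implements dry-run internally.

```python
import re
from dataclasses import dataclass, field

# Assumed stand-in for a PII check; a real policy would be more thorough.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class PolicyRunner:
    mode: str = "dry-run"  # "dry-run" or "enforce" (hypothetical values)
    audit_log: list = field(default_factory=list)

    def handle(self, text: str) -> bool:
        """Return True if the request proceeds, False if it is blocked."""
        violated = bool(SSN_PATTERN.search(text))
        if violated:
            # Violations are logged in both modes.
            self.audit_log.append({"message": "Contains PII", "mode": self.mode})
        # Only enforcement mode actually blocks.
        return not (violated and self.mode == "enforce")


runner = PolicyRunner(mode="dry-run")
allowed = runner.handle("My SSN is 123-45-6789")
print(allowed, len(runner.audit_log))
```

In dry-run mode the request proceeds while the violation still lands in the log, so the log can later be reviewed for false positives before switching modes.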
Test with Real Data
The best test uses real requests from your system:
- Go to Governance → Audit Trail
- Find recent requests
- Export as test data:
  [
    { "user_id": "user_123", "model": "gpt-4", "text": "..." },
    { "user_id": "user_456", "model": "claude", "text": "..." }
  ]
- Upload to policy test
- Run tests against real data
- Validate policy works correctly
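The exported records shown above lack the `type` field that appears in the test input format. If your policy checks that field, the records may need it added before upload. The sketch below assumes the export is a JSON array like the sample; `to_test_inputs` is a hypothetical helper, and whether the upload step requires `type` is an assumption.

```python
import json

# Sample export copied from the steps above; the "..." values are placeholders.
exported = """[
  {"user_id": "user_123", "model": "gpt-4", "text": "..."},
  {"user_id": "user_456", "model": "claude", "text": "..."}
]"""


def to_test_inputs(raw: str, request_type: str = "request") -> list:
    """Add a default "type" to each record; an existing "type" is kept."""
    return [{"type": request_type, **record} for record in json.loads(raw)]


test_inputs = to_test_inputs(exported)
print(len(test_inputs), test_inputs[0]["type"])
```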
Regression Testing
Before modifying a policy, save a baseline:
- Create test suite with 10-20 representative cases
- Run the tests and confirm that all pass
- Modify policy
- Run same tests again
- If any now fail (regression), fix the issue
- Once all pass, commit changes
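The baseline comparison in these steps amounts to finding cases that passed before the change and fail after it. A minimal sketch, assuming results are recorded as a mapping from case name to pass/fail (`find_regressions` is a hypothetical helper):

```python
def find_regressions(baseline: dict, current: dict) -> list:
    """Return names of cases that passed in the baseline but fail now.

    A case missing from the current run is treated as failing.
    """
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


baseline = {"normal text": True, "exact key": True, "partial key": True}
after_change = {"normal text": True, "exact key": True, "partial key": False}

regressions = find_regressions(baseline, after_change)
print(regressions)
```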
Performance Testing
Check if policy adds excessive latency:
- Go to Reports → Policy Performance
- See latency added by policy:
  Policy: Block PII
  Avg latency: 8ms
  P95 latency: 25ms
  P99 latency: 50ms
- If >100ms, optimize:
  - Simplify regex patterns
  - Cache data lookups
  - Disable unnecessary checks
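If you collect your own latency samples, for example from the `latency_ms` field of test responses, the same summary figures can be computed directly. This sketch uses the nearest-rank percentile method and made-up sample values. The guide does not say which figure the 100ms threshold applies to, so checking it against P99 here is an assumption.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(pct/100 * n)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


latencies_ms = [5, 6, 6, 7, 7, 8, 8, 9, 12, 25]  # made-up sample values

summary = {
    "avg": sum(latencies_ms) / len(latencies_ms),
    "p95": percentile(latencies_ms, 95),
    "p99": percentile(latencies_ms, 99),
}
# Assumption: compare the 100ms threshold against P99.
needs_optimization = summary["p99"] > 100
print(summary, needs_optimization)
```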
Testing Rego Policies
For complex Rego policies, test thoroughly:
package policies.complex_rule

deny[msg] {
    # Complex logic with multiple conditions
    user_id := input.user_id
    user_dept := data.departments[user_id]
    monthly_tokens := data.monthly_usage[user_dept]
    budget := data.dept_budgets[user_dept]

    monthly_tokens + input.tokens > budget
    msg := "Budget exceeded"
}
Test Cases:
Case 1: User under budget
  Input: dept=eng, tokens=1000, used=40000, budget=100000
  Expected: ALLOWED
Case 2: User would exceed budget
  Input: dept=eng, tokens=60001, used=40000, budget=100000
  Expected: BLOCKED
Case 3: Different department
  Input: dept=marketing, tokens=5000, used=50000, budget=100000
  Expected: ALLOWED
The rule uses a strict > comparison, so a request that brings usage exactly to the budget (tokens=60000 with used=40000) is still allowed.
API Testing
Test policies via API:
curl -X POST http://localhost:5000/api/v1/governance/policies/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy_id": "policy_123",
    "test_input": {
      "type": "request",
      "user_id": "user_123",
      "text": "..."
    }
  }'
Response:
{
  "policy_id": "policy_123",
  "triggered": false,
  "message": null,
  "latency_ms": 8
}
Deployment Checklist
Before deploying a policy:
- All test cases pass
- No false positives identified
- Performance acceptable (<100ms)
- Approval obtained (if required)
- Stakeholders notified
- Rollback plan documented
- Monitoring alerts configured
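The measurable items on this checklist can be turned into an automated gate, leaving the remaining items (stakeholder notification, rollback plan, monitoring) for manual sign-off. The sketch below is one possible shape for such a gate; `Readiness` and `blocking_issues` are hypothetical names, and applying the 100ms limit to P99 latency is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Readiness:
    tests_passed: int
    tests_total: int
    false_positives: int
    p99_latency_ms: float
    approved: bool = True  # set to False when approval is required but missing


def blocking_issues(r: Readiness, max_latency_ms: float = 100.0) -> list:
    """Return human-readable reasons the policy should not be deployed yet."""
    issues = []
    if r.tests_passed < r.tests_total:
        issues.append(f"{r.tests_total - r.tests_passed} test case(s) failing")
    if r.false_positives > 0:
        issues.append(f"{r.false_positives} false positive(s) identified")
    if r.p99_latency_ms >= max_latency_ms:
        issues.append(f"latency {r.p99_latency_ms}ms is not under {max_latency_ms}ms")
    if not r.approved:
        issues.append("approval not obtained")
    return issues


issues = blocking_issues(Readiness(tests_passed=3, tests_total=4,
                                   false_positives=0, p99_latency_ms=50.0))
print(issues)
```

An empty list means the automated checks passed; any entry is a reason to hold the deployment.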
Common Testing Issues
Policy Not Triggering
Problem: Policy should fire but doesn’t.
Solution:
- Check test input matches policy fields
- Verify condition logic in policy
- Use debug print() statements
- Check data dependencies exist
False Positives
Problem: Legitimate input wrongly blocked.
Solution:
- Add to allowlist
- Adjust pattern/threshold
- Add exceptions for specific cases
- Test with more edge cases
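As an illustration of the allowlist approach, the sketch below applies it to the SSN edge case from the Test Cases section. The pattern and the `ALLOWLIST_PHRASES` values are assumptions chosen for the example, not TruthVouch defaults.

```python
import re

# Assumed stand-in for the PII pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical allowlist: phrases suggesting the match is documentation,
# not a real identifier.
ALLOWLIST_PHRASES = ("example ssn", "ssn format")


def evaluate(text: str) -> str:
    """Block SSN-like text unless an allowlisted phrase is present."""
    if not SSN_PATTERN.search(text):
        return "ALLOWED"
    lowered = text.lower()
    if any(phrase in lowered for phrase in ALLOWLIST_PHRASES):
        return "ALLOWED"
    return "BLOCKED"


print(evaluate("My SSN is 123-45-6789"))
print(evaluate("The example SSN format is 123-45-6789"))
print(evaluate("What is machine learning?"))
```

A phrase-based allowlist trades false positives for possible false negatives, since a real SSN sent alongside an allowlisted phrase would pass. Add a test case for that scenario before relying on this kind of exception.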
Performance Issues
Problem: Policy adds too much latency.
Solution:
- Simplify regex patterns
- Cache expensive lookups
- Reduce scope (don’t apply to all users)
- Profile the policy