Testing Policies

Overview

Test every policy before deployment. TruthVouch provides a test harness to validate that policies behave correctly on sample data.

Quick Test

  1. Go to Governance → Policies → [Policy] → Test
  2. Enter test input:
     {
       "type": "request",
       "user_id": "user_123",
       "model": "gpt-4",
       "tokens": 5000,
       "text": "What is AI?"
     }
  3. Click Run Test
  4. Check whether the policy triggered or allowed the request

Test Cases

Create multiple test cases to cover scenarios:

Test Case 1: Should Block

Input:
type: request
text: "My SSN is 123-45-6789"
Expected: BLOCKED
Expected Message: Contains PII

Test Case 2: Should Allow

Input:
type: request
text: "What is machine learning?"
Expected: ALLOWED

Test Case 3: Edge Case

Input:
type: request
text: "The example SSN format is 123-45-6789"
Expected: ? (Check if false positive)
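To see why Test Case 3 matters, here is a minimal sketch of a PII check (the pattern is hypothetical, not TruthVouch's actual detector): a naive SSN-shaped regex flags the documentation example too, which is exactly the false positive the edge case is probing for.

```python
import re

# Hypothetical SSN pattern: any NNN-NN-NNNN string triggers the policy.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def evaluate(text: str) -> str:
    """Return the policy decision for a request's text."""
    return "BLOCKED" if SSN_PATTERN.search(text) else "ALLOWED"

print(evaluate("My SSN is 123-45-6789"))                  # BLOCKED
print(evaluate("What is machine learning?"))              # ALLOWED
# The naive pattern also blocks the documentation example:
print(evaluate("The example SSN format is 123-45-6789"))  # BLOCKED
```

Whether that third result is acceptable depends on your policy's intent; if not, see False Positives under Common Testing Issues.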

Test Input Format

Structure test input to match your policy:

{
  "type": "request",          // or "response"
  "user_id": "user_123",
  "model": "gpt-4",
  "tokens": 5000,
  "text": "...",
  "destination": "external",
  "user_type": "internal",
  "safety_flags": {
    "toxicity": 0.2,
    "bias": 0.1
  }
}

Include only the fields your policy actually checks.

Test Coverage

Aim to cover all four scenario classes:

  • Happy Path: Normal, allowed input
  • Violation Path: Input that should trigger denial
  • Edge Cases: Boundary conditions
  • False Positives: Legitimate input that might wrongly trigger

Example for “Block API Keys” Policy:

Test 1: Normal text
Input: "How do I use the API?"
Expected: ALLOWED

Test 2: Exact API key
Input: "api_key=sk_live_abc123def456ghi789"
Expected: BLOCKED

Test 3: Partial key (false positive check)
Input: "Documentation: api_key parameter"
Expected: ALLOWED

Test 4: Encoded key
Input: "ak_prod_6f7c8d9e0a1b2c3d4e5f6a7b8c9d0e"
Expected: ?
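Test 4 is the interesting one: a policy that only knows one key prefix silently misses others. A sketch with a hypothetical pattern list (the prefixes and length bounds are assumptions, not TruthVouch's real rules) shows that each key format needs its own pattern:

```python
import re

# Hypothetical key patterns. Without the second entry, the
# "ak_prod_..." key in Test 4 would slip through as ALLOWED.
KEY_PATTERNS = [
    re.compile(r"\bsk_live_[A-Za-z0-9]{18,}\b"),
    re.compile(r"\bak_prod_[A-Za-z0-9]{24,}\b"),
]

def evaluate(text: str) -> str:
    return "BLOCKED" if any(p.search(text) for p in KEY_PATTERNS) else "ALLOWED"

print(evaluate("How do I use the API?"))                    # ALLOWED
print(evaluate("api_key=sk_live_abc123def456ghi789"))       # BLOCKED
print(evaluate("Documentation: api_key parameter"))         # ALLOWED
print(evaluate("ak_prod_6f7c8d9e0a1b2c3d4e5f6a7b8c9d0e"))   # BLOCKED
```

The minimum-length bounds keep the partial-key mention in Test 3 from matching, which is the false-positive behavior you want.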

Batch Testing

Test multiple cases at once:

  1. Click + Add Test Case
  2. Enter name, input, expected result
  3. Repeat for all cases
  4. Click Run All Tests
  5. See pass/fail for each

Results:

Test Suite: API Key Protection
✓ Test 1: Normal text - PASSED
✓ Test 2: Exact API key - PASSED
✓ Test 3: Partial key - PASSED
⚠ Test 4: Encoded key - FAILED (expected BLOCKED, got ALLOWED)
Pass Rate: 75% (3/4)

If any fail, adjust policy logic and retest.
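What Run All Tests does can be approximated in a few lines. Here `evaluate` is a stand-in for the deployed policy (a substring check, purely illustrative); the fourth case fails for the same reason as in the sample output above, because the stand-in only knows the `sk_live_` prefix:

```python
# Stand-in for the deployed policy under test.
def evaluate(text):
    return "BLOCKED" if "sk_live_" in text else "ALLOWED"

cases = [
    ("Normal text", "How do I use the API?", "ALLOWED"),
    ("Exact API key", "api_key=sk_live_abc123", "BLOCKED"),
    ("Partial key", "Documentation: api_key parameter", "ALLOWED"),
    ("Encoded key", "ak_prod_6f7c8d9e0a1b2c3d", "BLOCKED"),
]

# Run every case and compare against its expected result.
results = [(name, evaluate(text) == expected) for name, text, expected in cases]
passed = sum(ok for _, ok in results)
for name, ok in results:
    print(("✓" if ok else "⚠") + f" {name} - " + ("PASSED" if ok else "FAILED"))
print(f"Pass Rate: {passed / len(cases):.0%} ({passed}/{len(cases)})")
```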

Dry-Run Mode

Dry-run mode deploys a policy without enforcement: violations are logged, but requests are not blocked.

  1. Go to Governance → Policies → [Policy]
  2. Click Deploy
  3. Select Dry-Run Mode
  4. Set duration: 24 hours, 7 days, etc.
  5. Click Deploy to Dry-Run

What happens:

  • Policy evaluated on all requests
  • Violations logged to audit trail
  • User request proceeds (not blocked)
  • You see real data on policy impact

After dry-run:

  1. Go to Reports → Policy Impact
  2. See how many violations would have been blocked
  3. Check for false positives
  4. If good: Change to enforcement mode
  5. If issues: Adjust policy, retest
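The dry-run semantics amount to: evaluate, log, but always allow. A minimal sketch, where the policy and field names are illustrative rather than TruthVouch's internals:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("audit")

def handle(request, policy, dry_run=True):
    """Evaluate a policy; in dry-run, log violations but let the request through."""
    triggered, message = policy(request)
    if triggered:
        log.info("policy violation (dry_run=%s): %s", dry_run, message)
        if not dry_run:
            return {"allowed": False, "message": message}
    return {"allowed": True}

# Hypothetical policy: deny requests over 100k tokens.
policy = lambda req: (req["tokens"] > 100_000, "Token limit exceeded")

print(handle({"tokens": 150_000}, policy, dry_run=True))   # allowed, but logged
print(handle({"tokens": 150_000}, policy, dry_run=False))  # blocked
```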

Test with Real Data

The best test uses real requests from your system:

  1. Go to Governance → Audit Trail
  2. Find recent requests
  3. Export as test data:
     [
       { "user_id": "user_123", "model": "gpt-4", "text": "..." },
       { "user_id": "user_456", "model": "claude", "text": "..." }
     ]
  4. Upload to policy test
  5. Run tests against real data
  6. Validate policy works correctly
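Exported records can be turned into test cases mechanically. A sketch, assuming the export format shown above; expected results start empty and are filled in during review:

```python
import json

# Illustrative audit-trail export in the format shown above.
export = '''
[
  {"user_id": "user_123", "model": "gpt-4", "text": "What is AI?"},
  {"user_id": "user_456", "model": "claude", "text": "My SSN is 123-45-6789"}
]
'''

# Wrap each record as a test input; "expected" is filled in by review.
cases = [
    {"type": "request", **record, "expected": None}
    for record in json.loads(export)
]
print(len(cases), "test cases ready for upload")
```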

Regression Testing

Before modifying a policy, save a baseline:

  1. Create test suite with 10-20 representative cases
  2. Run tests, note all pass
  3. Modify policy
  4. Run same tests again
  5. If any now fail (regression), fix the issue
  6. Once all pass, commit changes
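The baseline comparison in steps 2-5 can be sketched as follows; the two policy versions and the suite are hypothetical. A regression is any case that passed on the baseline but fails after the change:

```python
# Old policy only knows one key prefix; the modification adds a second.
old_policy = lambda text: "sk_live_" in text
new_policy = lambda text: "sk_live_" in text or "ak_prod_" in text

# (input text, expected: should the policy trigger?)
suite = [
    ("api_key=sk_live_abc123", True),
    ("ak_prod_6f7c8d9e0a1b2c3d", True),
    ("How do I use the API?", False),
]

def run(policy):
    return {text: policy(text) == expected for text, expected in suite}

baseline, current = run(old_policy), run(new_policy)
regressions = [t for t in baseline if baseline[t] and not current[t]]
print("Regressions:", regressions)
```

Here the change fixes a previously failing case and breaks nothing, so the regression list is empty.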

Performance Testing

Check if policy adds excessive latency:

  1. Go to Reports → Policy Performance

  2. See latency added by policy:

    Policy: Block PII
    Avg latency: 8ms
    P95 latency: 25ms
    P99 latency: 50ms
  3. If >100ms, optimize:

    • Simplify regex patterns
    • Cache data lookups
    • Disable unnecessary checks
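Outside the dashboard, the same avg/P95/P99 numbers can be measured directly. A rough sketch, timing a hypothetical regex policy over many evaluations:

```python
import re
import time

# Hypothetical policy to profile: the SSN pattern from earlier.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def evaluate(text):
    return SSN.search(text) is not None

# Time 1000 evaluations and collect per-call latency in ms.
samples = []
for i in range(1000):
    start = time.perf_counter()
    evaluate(f"request {i}: the quick brown fox 123-45-6789")
    samples.append((time.perf_counter() - start) * 1000)

samples.sort()
print(f"Avg latency: {sum(samples) / len(samples):.3f}ms")
print(f"P95 latency: {samples[int(0.95 * len(samples))]:.3f}ms")
print(f"P99 latency: {samples[int(0.99 * len(samples))]:.3f}ms")
```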

Testing Rego Policies

For complex Rego policies, test thoroughly:

package policies.complex_rule

deny[msg] {
    # Complex logic with multiple conditions
    user_id := input.user_id
    user_dept := data.departments[user_id]
    monthly_tokens := data.monthly_usage[user_dept]
    budget := data.dept_budgets[user_dept]
    monthly_tokens + input.tokens > budget
    msg := "Budget exceeded"
}

Test Cases:

Case 1: User under budget
Input: dept=eng, tokens=1000, used=40000, budget=100000
Expected: ALLOWED

Case 2: User would exceed budget
Input: dept=eng, tokens=70000, used=40000, budget=100000
Expected: BLOCKED

Case 3: Different department
Input: dept=marketing, tokens=5000, used=50000, budget=100000
Expected: ALLOWED
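The budget rule can be mirrored in plain code to sanity-check the cases before wiring up real data. The department data below is illustrative; note that with the strict > comparison, a request must push usage past the budget (not merely up to it) to be denied:

```python
# Illustrative stand-ins for data.monthly_usage and data.dept_budgets.
MONTHLY_USAGE = {"eng": 40_000, "marketing": 50_000}
DEPT_BUDGETS = {"eng": 100_000, "marketing": 100_000}

def deny(dept: str, tokens: int) -> bool:
    """Mirror of the Rego condition: used + requested > budget."""
    return MONTHLY_USAGE[dept] + tokens > DEPT_BUDGETS[dept]

print(deny("eng", 1_000))        # False -> ALLOWED
print(deny("eng", 70_000))       # True  -> BLOCKED
print(deny("marketing", 5_000))  # False -> ALLOWED
```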

API Testing

Test policies via API:

curl -X POST http://localhost:5000/api/v1/governance/policies/test \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "policy_id": "policy_123",
    "test_input": {
      "type": "request",
      "user_id": "user_123",
      "text": "..."
    }
  }'

Response:

{
  "policy_id": "policy_123",
  "triggered": false,
  "message": null,
  "latency_ms": 8
}

Deployment Checklist

Before deploying a policy:

  • All test cases pass
  • No false positives identified
  • Performance acceptable (<100ms)
  • Approval obtained (if required)
  • Stakeholders notified
  • Rollback plan documented
  • Monitoring alerts configured

Common Testing Issues

Policy Not Triggering

Problem: Policy should fire but doesn’t.

Solution:

  1. Check test input matches policy fields
  2. Verify condition logic in policy
  3. Use debug print() statements
  4. Check data dependencies exist

False Positives

Problem: Legitimate input wrongly blocked.

Solution:

  1. Add to allowlist
  2. Adjust pattern/threshold
  3. Add exceptions for specific cases
  4. Test with more edge cases
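An allowlist check typically runs before the main pattern. A minimal sketch, reusing the SSN example from earlier; the allowlist phrase is hypothetical:

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical allowlist: contexts where an SSN-shaped string is legitimate.
ALLOWLIST = [re.compile(r"example\s+SSN", re.IGNORECASE)]

def evaluate(text: str) -> str:
    # Allowlist matches short-circuit the PII check entirely.
    if any(p.search(text) for p in ALLOWLIST):
        return "ALLOWED"
    return "BLOCKED" if SSN.search(text) else "ALLOWED"

print(evaluate("My SSN is 123-45-6789"))                  # BLOCKED
print(evaluate("The example SSN format is 123-45-6789"))  # ALLOWED
```

This resolves the false positive from Test Case 3 while still blocking real PII.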

Performance Issues

Problem: Policy adds too much latency.

Solution:

  1. Simplify regex patterns
  2. Cache expensive lookups
  3. Reduce scope (don’t apply to all users)
  4. Profile the policy