
Audit Logging in RAG Systems
Implement a comprehensive logging strategy to track data lineage, user queries, and system responses for compliance.
If your RAG system makes a mistake, you need to know who asked the question, which documents were retrieved, and why the model answered the way it did. That is the goal of audit logging.
What to Log?
A complete RAG log entry should include (a schema sketch follows the list):
- Timestamp and user ID.
- The original query.
- The pre-processed query (if you use multi-query or query rewriting).
- Retrieved document IDs and their similarity scores.
- Applied metadata filters.
- The full prompt sent to the LLM (including the system prompt).
- The generated response.
- Confidence score / verification results.
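These fields can be made explicit with a lightweight schema. Below is a minimal sketch using Python's TypedDict; the field names are illustrative, not a standard.

from typing import TypedDict

class RagLogEvent(TypedDict, total=False):
    timestamp: str          # ISO-8601, UTC
    user_id: str
    query: str              # the original user query
    rewritten_query: str    # after multi-query or rewriting
    filters: dict           # applied metadata filters
    retrieved: list[str]    # retrieved document IDs
    scores: list[float]     # similarity scores
    prompt: str             # full prompt, including the system prompt
    response: str           # generated response
    confidence: float       # confidence / verification result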
Implementation with Structured Logging
import json
import logging
from datetime import datetime, timezone

# In production, route this handler to CloudWatch or a dedicated audit DB.
logging.basicConfig(level=logging.INFO)

def log_rag_event(event: dict) -> None:
    # Emit the event as a single JSON line for easy ingestion and querying.
    logging.info(json.dumps(event))

# Example event
rag_log = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "user_id": "user_123",
    "query": "What is the policy on X?",
    "retrieved": ["doc_A_p5", "doc_B_p2"],
    "scores": [0.92, 0.88],
    "model": "claude-3-5-sonnet",
    "response": "The policy is...",
}

log_rag_event(rag_log)
Privacy in Logs
Warning: Never log raw PII. If the user's query contains an email address, redact it before it hits the audit logs.
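A minimal sketch of email redaction follows; the regex and the redact_pii helper are illustrative, not exhaustive, and production systems should use a dedicated PII-detection library.

import re

# Illustrative pattern: matches most common email address shapes.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact_pii(text: str) -> str:
    # Replace every email address with a placeholder before logging.
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

print(redact_pii("Ask alice@example.com about the policy"))
# Ask [REDACTED_EMAIL] about the policy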
Using Logs for "Ground Truth"
Over time, these logs become your "Golden Dataset." You can review the logs, identify perfect answers, and use them to test future versions of your RAG system.
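One possible workflow, sketched below, assumes the audit log is a JSON-lines file and that reviewers tag good answers with a hypothetical "review" field:

import json

golden = []
with open("audit_log.jsonl") as f:               # hypothetical log file, one JSON event per line
    for line in f:
        event = json.loads(line)
        if event.get("review") == "approved":    # hypothetical reviewer tag
            golden.append({"query": event["query"],
                           "expected": event["response"]})

# Save query/answer pairs as a regression test set for future versions.
with open("golden_dataset.json", "w") as f:
    json.dump(golden, f, indent=2)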
Compliance Requirements
- Retention: How long must you keep logs? (e.g., seven years for financial records).
- Immutability: Logs should be stored in Write-Once-Read-Many (WORM) storage to prevent tampering; see the sketch below.
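On AWS, for example, S3 Object Lock can enforce WORM semantics. Here is a minimal sketch using boto3, assuming a bucket created with Object Lock enabled; the bucket and key names are hypothetical.

import json
from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client("s3")
event = {"user_id": "user_123", "query": "What is the policy on X?"}

# COMPLIANCE mode blocks deletion and overwrite by any user until the
# retain-until date passes, a WORM guarantee enforced by S3 itself.
s3.put_object(
    Bucket="my-rag-audit-logs",      # hypothetical; must have Object Lock enabled
    Key="rag/2024/event_123.json",   # hypothetical key
    Body=json.dumps(event),
    ObjectLockMode="COMPLIANCE",
    ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=7 * 365),
)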
Exercises
- Design a SQL schema for storing RAG audit logs.
- Why is logging the "System Prompt" important for debugging?
- How can you use logs to identify "dead documents" (documents that are never retrieved)?