Module 8 Lesson 2: Source Attribution
Proving the Answer. How to extract citations and references to show users exactly where the AI found its information.
Citations: The Proof of Truth
In a professional setting, "Because the AI said so" is not an acceptable answer. Users need to see the Evidence. Bedrock Knowledge Bases return granular Citations that map specific parts of the answer to the source passages they were drawn from, along with the location of each source document (for example, the S3 URI of a PDF).
1. Extracting Citations
The retrieve_and_generate response contains a citations key. Each citation pairs a part of the generated answer with the retrievedReferences that support it.
# Assuming you ran the code from Lesson 1
citations = response.get("citations", [])
for citation in citations:
    for reference in citation["retrievedReferences"]:
        # The actual snippet used
        text = reference["content"]["text"]
        # The S3 path or location
        source = reference["location"]["s3Location"]["uri"]
        print(f"Source: {source}")
        print(f"Snippet: {text[:100]}...")
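The loop above prints every reference, but it does not show which part of the answer each reference supports. The sketch below pairs each answer fragment with its sources. It runs against a hand-written sample dictionary rather than a live call, so the bucket name, file name, and text are illustrative assumptions, and the helper name pair_text_with_sources is hypothetical. The generatedResponsePart and textResponsePart keys reflect the response shape as I understand it; check them against your own response before relying on them.

```python
def pair_text_with_sources(response):
    """Return (answer_fragment, [source_uri, ...]) pairs from a response dict."""
    pairs = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        uris = []
        for reference in citation.get("retrievedReferences", []):
            # Non-S3 data sources use a different location key, so fall back safely.
            location = reference.get("location", {})
            uri = location.get("s3Location", {}).get("uri", "unknown")
            if uri not in uris:
                uris.append(uri)
        pairs.append((part.get("text", ""), uris))
    return pairs


# Hand-written sample shaped like a retrieve_and_generate response (illustrative data only).
sample_response = {
    "output": {"text": "The insurance covers dental. Claims are paid within 30 days."},
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {"text": "The insurance covers dental."}
            },
            "retrievedReferences": [
                {
                    "content": {"text": "Dental care is covered under Plan A."},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/Policy.pdf"},
                    },
                }
            ],
        },
        {
            "generatedResponsePart": {
                "textResponsePart": {"text": "Claims are paid within 30 days."}
            },
            "retrievedReferences": [],
        },
    ],
}

for fragment, uris in pair_text_with_sources(sample_response):
    print(f"{fragment} -> {uris or 'no source'}")
```

A fragment that comes back with an empty source list is worth flagging in the UI, since nothing retrieved backs it up.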
2. Why Citations Matter
- Trust: Users are more likely to trust the AI if they can click a link to the PDF.
- Safety: If an AI gives wrong advice, citations allow a human expert to see where the logic broke.
- Copyright: Citations give proper attribution to the source data.
3. Visualizing the Linkage
graph LR
Answer["The insurance covers dental [1]"]
Answer -.-> Cit["[1] Page 4 of Policy.pdf"]
Cit --> PDF[Original Document]
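The linkage in the diagram can be approximated in code by appending numbered markers to each answer fragment and collecting a matching footnote list. This is a minimal sketch, not a prescribed rendering: the function name render_with_footnotes is hypothetical, the sample citations are hand-written, and it assumes every source lives in S3.

```python
def render_with_footnotes(citations):
    """Build an answer string with [n] markers plus a numbered source list."""
    numbers = {}  # source URI -> footnote number, in order of first use
    answer_parts = []
    for citation in citations:
        text = citation["generatedResponsePart"]["textResponsePart"]["text"]
        markers = []
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri is None:
                continue  # this sketch skips non-S3 locations
            # Reuse the existing number if this source was already cited.
            number = numbers.setdefault(uri, len(numbers) + 1)
            marker = f"[{number}]"
            if marker not in markers:
                markers.append(marker)
        suffix = " " + "".join(markers) if markers else ""
        answer_parts.append(text + suffix)
    footnotes = [f"[{n}] {uri}" for uri, n in numbers.items()]
    return " ".join(answer_parts), footnotes


def _ref(uri):
    """Build a minimal reference entry for the sample data."""
    return {"location": {"type": "S3", "s3Location": {"uri": uri}}}


# Illustrative sample data; the bucket and file names are made up.
sample_citations = [
    {
        "generatedResponsePart": {"textResponsePart": {"text": "The insurance covers dental"}},
        "retrievedReferences": [
            _ref("s3://example-bucket/Policy.pdf"),
            _ref("s3://example-bucket/Policy.pdf"),
        ],
    },
    {
        "generatedResponsePart": {"textResponsePart": {"text": "and vision care."}},
        "retrievedReferences": [
            _ref("s3://example-bucket/Benefits.pdf"),
            _ref("s3://example-bucket/Policy.pdf"),
        ],
    },
]

answer, footnotes = render_with_footnotes(sample_citations)
print(answer)
print("\n".join(footnotes))
```

Numbering sources by first appearance keeps the footnote list short when several fragments cite the same document.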
4. Metadata and Filtering
You can add custom Metadata to your S3 files (e.g., department: "HR") by uploading a companion .metadata.json file next to each document.
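One way to produce that metadata is to generate the sidecar file programmatically before uploading. The sketch below assumes the convention of a file named after the document with .metadata.json appended and a top-level metadataAttributes object; the helper name write_metadata_sidecar and the file name Policy.pdf are hypothetical, and the code only writes to a local temporary directory.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(document_path, attributes):
    """Write <document>.metadata.json next to a local copy of the document."""
    document_path = Path(document_path)
    sidecar_path = document_path.with_name(document_path.name + ".metadata.json")
    payload = {"metadataAttributes": attributes}
    sidecar_path.write_text(json.dumps(payload, indent=2))
    return sidecar_path


# Demonstrate in a temporary directory; the document itself need not exist here.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = write_metadata_sidecar(Path(tmp) / "Policy.pdf", {"department": "HR"})
    print(sidecar.name)
    print(sidecar.read_text())
```

Both the document and its sidecar would then be uploaded to the same S3 prefix, and the data source typically needs to be synced again before the new attributes can be used in filters.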
When you query the KB, you can Filter the results:
"filter": {
    "equals": {
        "key": "department",
        "value": "HR"
    }
}
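Because this filter returns only chunks tagged department: "HR", setting the value from the signed-in user's own department ensures a Finance employee doesn't accidentally see HR files during their search.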
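To show where the filter sits in a full request, here is a minimal sketch that builds the keyword arguments for retrieve_and_generate with the department supplied by the caller. The function name build_filtered_request, the query text, and the placeholder IDs are assumptions for illustration. The nesting under retrievalConfiguration and vectorSearchConfiguration reflects the request shape as I understand it, and the actual call is left commented out so the example runs without AWS credentials.

```python
def build_filtered_request(query, knowledge_base_id, model_arn, department):
    """Build keyword arguments for retrieve_and_generate with a department filter."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {
                        # Only chunks whose metadata matches are retrieved.
                        "filter": {
                            "equals": {"key": "department", "value": department}
                        }
                    }
                },
            },
        },
    }


request = build_filtered_request(
    query="How many vacation days do new hires get?",
    knowledge_base_id="KB_ID_PLACEHOLDER",  # hypothetical placeholder
    model_arn="MODEL_ARN_PLACEHOLDER",      # hypothetical placeholder
    department="HR",
)
print(request["retrieveAndGenerateConfiguration"]["knowledgeBaseConfiguration"]
      ["retrievalConfiguration"])

# With AWS credentials configured, the request could be sent like this:
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve_and_generate(**request)
```

Deriving the department value on the server from the authenticated user, rather than accepting it from the client, is what makes the filter act as an access control instead of a search convenience.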
Summary
- Citations provide verifiable evidence for AI answers.
- retrievedReferences contain the location and snippet for each turn.
- Metadata allows for fine-grained security filtering during retrieval.
- Displaying sources is a mandatory UX Best Practice for RAG apps.