Module 8 Lesson 1: Retrieve and Generate
The RAG APIs. Understanding the difference between raw retrieval and the fully managed 'Answer' API.
Talking to your Data: The RAG APIs
Once your Knowledge Base (KB) is synced, you can query it using two different methods: retrieve (Just the facts) and retrieve_and_generate (The answer).
1. retrieve (The Raw Facts)
Use this if you want to handle the "Generation" yourself (e.g., using a custom prompt).
- Output: A list of the most relevant chunks from your PDFs, typically with a relevance score and source location for each.
- Why?: Total control over how the AI uses the data.
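A minimal sketch of the retrieve-then-generate-yourself pattern described above. The `client` argument is assumed to be a boto3 `bedrock-agent-runtime` client; `retrieve_chunks` and `build_prompt` are hypothetical helper names, and the prompt wording is only an illustration, not a recommended template.

```python
def retrieve_chunks(client, kb_id, question, top_k=5):
    """Call the Retrieve API and return the list of raw result entries."""
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return response["retrievalResults"]


def build_prompt(question, results):
    """Hypothetical helper: fold the retrieved chunks into a custom prompt."""
    # Number each chunk so the answer can refer back to a specific one.
    context = "\n\n".join(
        f"[{i}] {result['content']['text']}"
        for i, result in enumerate(results, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The resulting prompt string can then be sent to whichever model you choose, which is where the "total control" comes from: the wording, the number of chunks, and the model are all yours to decide.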
2. retrieve_and_generate (The Full Loop)
This is a "One-Stop-Shop" API. It finds the facts, tells the AI to read them, and gives you a complete answer in one call.
import boto3

# Both RAG APIs live on the bedrock-agent-runtime client.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# One call: retrieve the relevant chunks and have the model write the answer.
response = client.retrieve_and_generate(
    input={"text": "What is the holiday policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # replace with your Knowledge Base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)

# The generated answer text.
print(response["output"]["text"])
3. Visualizing the Decision
graph TD
User[Question] --> Mode{Choose Mode}
Mode -->|Retrieve Only| R[Raw Chunks]
R --> Custom[Custom Logic/Tool]
Mode -->|Retrieve & Generate| RG[Managed Brain]
RG --> Final[Final Human Answer]
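The two branches of the diagram could be expressed in code roughly as follows. This is a sketch under stated assumptions: `query_knowledge_base` is a hypothetical wrapper, `client` is assumed to be a boto3 `bedrock-agent-runtime` client, and using the presence of `model_arn` as the mode switch is just one possible convention.

```python
def query_knowledge_base(client, kb_id, question, model_arn=None):
    """Return raw chunk texts when no model is given, otherwise a generated answer."""
    if model_arn is None:
        # "Retrieve Only" branch: hand the raw chunks to your own logic.
        response = client.retrieve(
            knowledgeBaseId=kb_id,
            retrievalQuery={"text": question},
        )
        return [r["content"]["text"] for r in response["retrievalResults"]]

    # "Retrieve & Generate" branch: let the managed API produce the answer.
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]
```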
4. Why retrieve_and_generate is usually better
- It automatically handles the System Prompt for RAG.
- It is optimized for Grounding (steering the AI to answer from the retrieved chunks rather than from its general training knowledge).
- It handles the Citations (Module 8 Lesson 2) automatically.
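To make the citation point concrete, here is a small sketch of reading the citations out of a `retrieve_and_generate` response. `list_sources` is a hypothetical helper, and it assumes the Knowledge Base uses an S3 data source; other source types report their location under different keys, in which case the URI comes back as `None`.

```python
def list_sources(response):
    """Collect (answer_part, source_uri) pairs from a retrieve_and_generate response."""
    pairs = []
    for citation in response.get("citations", []):
        # The slice of the generated answer that this citation supports.
        part = citation["generatedResponsePart"]["textResponsePart"]["text"]
        for ref in citation.get("retrievedReferences", []):
            # Assumption: S3 data source. Missing keys yield None instead of raising.
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            pairs.append((part, uri))
    return pairs
```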
Summary
- retrieve returns raw text segments.
- retrieve_and_generate returns a final answer and citations.
- Both APIs require the bedrock-agent-runtime client.
- Managed RAG is faster to build and easier to maintain.