The Precision Search: Context Assembly and Semantic Retrieval

Master the final stage of the RAG pipeline. Learn how to optimize retrieval parameters, implement hybrid search, and assemble the perfect context for your model.

Finding the Needle

We have cleaned our data and built our Knowledge Base. Now, a user asks a question. How do we ensure that the absolute best pieces of information are retrieved and packaged for the Foundation Model?

In the AWS Certified Generative AI Developer – Professional exam, you must understand the nuances of the Retrieve and RetrieveAndGenerate APIs. This is where you control the "intelligence" of your RAG system through retrieval configuration.


1. Retrieve vs. RetrieveAndGenerate

Amazon Bedrock provides two distinct ways to interact with your Knowledge Base:

The Retrieve API (Control)

  • What it does: Searches the vector store and returns the raw chunks (and their metadata).
  • Why use it: You want full control over the final prompt. You might want to use a specific model not supported by the higher-level API, or you want to "re-rank" the results yourself.

The RetrieveAndGenerate API (Convenience)

  • What it does: Does the search, builds the prompt, calls the model, and returns the final answer in one call.
  • Why use it: Speed of development. AWS manages the "Context Assembly" logic for you.
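
Section 5 below walks through RetrieveAndGenerate in full, so here is a minimal sketch of the lower-level Retrieve path instead. The Knowledge Base ID and query are placeholders:

import boto3

client = boto3.client('bedrock-agent-runtime')

# Retrieve returns the raw chunks; prompt assembly is left to you.
response = client.retrieve(
    knowledgeBaseId='KB_12345',  # placeholder ID
    retrievalQuery={'text': 'What is the return policy for electronics?'},
    retrievalConfiguration={
        'vectorSearchConfiguration': {'numberOfResults': 3}
    }
)

# Each result carries the chunk text, its source location, and a relevance score.
for result in response['retrievalResults']:
    print(round(result['score'], 3), result['content']['text'][:80])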

2. Optimizing Retrieval Parameters

When you perform a search, you have two primary levers to pull:

K (Maximum Number of Results)

  • Low K (1-3): Fast and cheap. Good for high-precision, single-fact answers.
  • High K (10-20): More comprehensive. Better for deep summaries or complex reasoning across documents.
  • Warning: Too high a K can exceed the model's context window or trigger the "lost in the middle" effect, where facts buried in the middle of a long prompt get ignored.

Retrieval Filter (Metadata Filtering)

Using the metadata we injected during indexing, we can restrict the search.

  • Example: "Retrieve chunks where security_clearance is 'Level 2' AND category is 'Sales'."
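
Expressed against the Retrieve API, that filter could look like the sketch below. The metadata keys are the ones from the example above and assume those attributes were attached to the chunks during ingestion:

retrieval_config = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        'filter': {
            'andAll': [  # both conditions must hold
                {'equals': {'key': 'security_clearance', 'value': 'Level 2'}},
                {'equals': {'key': 'category', 'value': 'Sales'}}
            ]
        }
    }
}

This dictionary is passed as the retrievalConfiguration argument of retrieve (or nested inside knowledgeBaseConfiguration for retrieve_and_generate).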

3. The Power of Hybrid Search

Classic "Semantic Search" (Vectors) is great for concepts. But it is surprisingly bad at finding specific keywords like "SKU-99421" or "Project X-Ray."

Hybrid Search combines:

  1. Vector Search: Finds things that mean the same thing as the query.
  2. Keyword Search: Finds things that match exactly.

AWS Implementation: Amazon OpenSearch Service supports hybrid search, allowing you to weight both types of results, and Bedrock Knowledge Bases can request it at query time.
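
In the Bedrock retrieval APIs, this shows up as a search-type override inside vectorSearchConfiguration. A sketch, assuming your underlying vector store supports hybrid queries:

retrieval_config = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        # 'HYBRID' blends keyword and vector scores;
        # 'SEMANTIC' forces pure vector search.
        'overrideSearchType': 'HYBRID'
    }
}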


4. Context Assembly and Prompt Templates

How the context is presented to the model matters. This is known as Context Assembly.

graph TD
    User[User Query] --> R[Retrieve Top K Chunks]
    R --> Filter[Metadata Filtering]
    Filter --> Template[Apply Prompt Template]
    Template --> FM[Foundation Model]
    
    subgraph Prompt_Template_Example
    T1[Instructions]
    T2[Retrieved Context]
    T3[User Question]
    end

The "No Answer" Guardrail

An essential part of context assembly is telling the model what to do when it doesn't find the answer.

  • Bad Prompt: "Answer the question." (Model will hallucinate if it doesn't know).
  • Pro Prompt: "Using ONLY the provided text, answer the question. If the information is not present, state exactly: 'I cannot find this information in the knowledge base.' Do not use outside knowledge."
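
If you use the Retrieve API, this assembly step is yours to implement. A minimal sketch, where assemble_prompt is a hypothetical helper and chunks is the retrievalResults list from a Retrieve call:

def assemble_prompt(chunks, question):
    # Join the retrieved chunk texts into a single context block.
    context = "\n\n".join(chunk['content']['text'] for chunk in chunks)
    return (
        "Using ONLY the provided text, answer the question. "
        "If the information is not present, state exactly: "
        "'I cannot find this information in the knowledge base.' "
        "Do not use outside knowledge.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )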

5. Code Example: Retrieval in Action (Boto3)

import boto3

client = boto3.client('bedrock-agent-runtime')

def ask_my_knowledge_base(query, kb_id):
    # Using the high-level RetrieveAndGenerate API: one call performs the
    # search, assembles the context, and invokes the model.
    response = client.retrieve_and_generate(
        input={'text': query},
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                # Model used for the generation step (region-scoped ARN)
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': 5  # This is our 'K'
                    }
                }
            }
        }
    )

    return response['output']['text']

# Usage
answer = ask_my_knowledge_base("What is the return policy for electronics?", "KB_12345")
print(answer)

6. Identifying Retrieval Failure

In the exam, you will see a question about "Grounding Errors."

  • Symptom: The AI provides a fluent, confident answer, but it is factually wrong.
  • Cause: The retrieval step returned low-confidence or irrelevant chunks, so the model generated from bad context.
  • Solution: Adjust the Similarity Threshold or implement Reranking to ensure only high-quality chunks reach the prompt.
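
Each Retrieve result includes a relevance score, so a pragmatic first step is a client-side cutoff before the chunks ever reach the prompt. A sketch; the 0.5 threshold is an arbitrary assumption you should tune against your own evaluation set:

MIN_SCORE = 0.5  # arbitrary starting point, not an AWS default

def filter_by_score(results, threshold=MIN_SCORE):
    # Drop chunks the vector store scored below the threshold.
    return [r for r in results if r.get('score', 0.0) >= threshold]

high_confidence = filter_by_score(response['retrievalResults'])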

Knowledge Check: Test Your Retrieval Skills

A developer is building a RAG application for a global retail brand. Some items have very similar semantic descriptions (e.g., 'Blue Running Shoes'), but the system MUST distinguish between specific SKU numbers (e.g., 'RS-100-BL' vs 'RS-200-BL'). Which search configuration is most appropriate?


Summary

You have now mastered the art of retrieval. This concludes Module 4. In the final module of Domain 1, we will look at Compliance and Data Governance, ensuring your Knowledge Base doesn't become a security liability.


Next Module: The Fortress: Handling Sensitive Data and Privacy
