The Precision Search: Context Assembly and Semantic Retrieval

Master the final stage of the RAG pipeline. Learn how to optimize retrieval parameters, implement hybrid search, and assemble the perfect context for your model.

Finding the Needle

We have cleaned our data and built our Knowledge Base. Now, a user asks a question. How do we ensure that the absolute best pieces of information are retrieved and packaged for the Foundation Model?

In the AWS Certified Generative AI Developer – Professional exam, you must understand the nuances of the Retrieve and RetrieveAndGenerate APIs. This is where you control the "intelligence" of your RAG system through retrieval configuration.


1. Retrieve vs. RetrieveAndGenerate

Amazon Bedrock provides two distinct ways to interact with your Knowledge Base:

The Retrieve API (Control)

  • What it does: Searches the vector store and returns the raw chunks (and their metadata).
  • Why use it: You want full control over the final prompt. You might want to use a specific model not supported by the higher-level API, or you want to "re-rank" the results yourself.

The RetrieveAndGenerate API (Convenience)

  • What it does: Does the search, builds the prompt, calls the model, and returns the final answer in one call.
  • Why use it: Speed of development. AWS manages the "Context Assembly" logic for you.
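
Section 5 below walks through RetrieveAndGenerate in full, so here is a minimal sketch of the lower-level Retrieve path instead. The Knowledge Base ID and query are placeholders:

import boto3

client = boto3.client('bedrock-agent-runtime')

# Retrieve returns the raw chunks; prompt assembly is left to you.
response = client.retrieve(
    knowledgeBaseId='KB_12345',  # placeholder ID
    retrievalQuery={'text': 'What is the return policy for electronics?'},
    retrievalConfiguration={
        'vectorSearchConfiguration': {'numberOfResults': 3}
    }
)

# Each result carries the chunk text, its source location, and a relevance score.
for result in response['retrievalResults']:
    print(round(result['score'], 3), result['content']['text'][:80])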

2. Optimizing Retrieval Parameters

When you perform a search, you have two primary levers to pull:

K (Maximum Number of Results)

  • Low K (1-3): Fast and cheap. Good for high-precision, single-fact answers.
  • High K (10-20): More comprehensive. Better for deep summaries or complex reasoning across documents.
  • Warning: Too high a K can exceed the model's context window or trigger the "lost in the middle" effect, where facts buried in the middle of a long prompt get ignored.

Retrieval Filter (Metadata Filtering)

Using the metadata we injected during indexing, we can restrict the search.

  • Example: "Retrieve chunks where security_clearance is 'Level 2' AND category is 'Sales'."
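
Expressed against the Retrieve API, that filter could look like the sketch below. The metadata keys are the ones from the example above and assume those attributes were attached to the chunks during ingestion:

retrieval_config = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        'filter': {
            'andAll': [  # both conditions must hold
                {'equals': {'key': 'security_clearance', 'value': 'Level 2'}},
                {'equals': {'key': 'category', 'value': 'Sales'}}
            ]
        }
    }
}

This dictionary is passed as the retrievalConfiguration argument of retrieve (or nested inside knowledgeBaseConfiguration for retrieve_and_generate).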

3. The Power of Hybrid Search

Classic "Semantic Search" (Vectors) is great for concepts. But it is surprisingly bad at finding specific keywords like "SKU-99421" or "Project X-Ray."

Hybrid Search combines:

  1. Vector Search: Finds things that mean the same thing as the query.
  2. Keyword Search: Finds things that match exactly.

AWS Implementation: Amazon OpenSearch Service supports hybrid search, allowing you to weight both types of results, and Bedrock Knowledge Bases can request it at query time.
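
In the Bedrock retrieval APIs, this shows up as a search-type override inside vectorSearchConfiguration. A sketch, assuming your underlying vector store supports hybrid queries:

retrieval_config = {
    'vectorSearchConfiguration': {
        'numberOfResults': 5,
        # 'HYBRID' blends keyword and vector scores;
        # 'SEMANTIC' forces pure vector search.
        'overrideSearchType': 'HYBRID'
    }
}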


4. Context Assembly and Prompt Templates

How the context is presented to the model matters. This is known as Context Assembly.

graph TD
    User[User Query] --> R[Retrieve Top K Chunks]
    R --> Filter[Metadata Filtering]
    Filter --> Template[Apply Prompt Template]
    Template --> FM[Foundation Model]
    
    subgraph Prompt_Template_Example
    T1[Instructions]
    T2[Retrieved Context]
    T3[User Question]
    end

The "No Answer" Guardrail

An essential part of context assembly is telling the model what to do when it doesn't find the answer.

  • Bad Prompt: "Answer the question." (Model will hallucinate if it doesn't know).
  • Pro Prompt: "Using ONLY the provided text, answer the question. If the information is not present, state exactly: 'I cannot find this information in the knowledge base.' Do not use outside knowledge."
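
If you use the Retrieve API, this assembly step is yours to implement. A minimal sketch, where assemble_prompt is a hypothetical helper and chunks is the retrievalResults list from a Retrieve call:

def assemble_prompt(chunks, question):
    # Join the retrieved chunk texts into a single context block.
    context = "\n\n".join(chunk['content']['text'] for chunk in chunks)
    return (
        "Using ONLY the provided text, answer the question. "
        "If the information is not present, state exactly: "
        "'I cannot find this information in the knowledge base.' "
        "Do not use outside knowledge.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )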

5. Code Example: Retrieval in Action (Boto3)

import boto3

client = boto3.client('bedrock-agent-runtime')

def ask_my_knowledge_base(query, kb_id):
    # Using the high-level RetrieveAndGenerate API: one call performs the
    # search, assembles the context, and invokes the model.
    response = client.retrieve_and_generate(
        input={'text': query},
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                # Model used for the generation step (region-scoped ARN)
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': 5  # This is our 'K'
                    }
                }
            }
        }
    )

    return response['output']['text']

# Usage
answer = ask_my_knowledge_base("What is the return policy for electronics?", "KB_12345")
print(answer)

6. Identifying Retrieval Failure

In the exam, you will see a question about "Grounding Errors."

  • Symptom: The AI provides a fluent, confident answer, but it is factually wrong.
  • Cause: The retrieval step returned low-confidence or irrelevant chunks, so the model generated from bad context.
  • Solution: Adjust the Similarity Threshold or implement Reranking to ensure only high-quality chunks reach the prompt.
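
Each Retrieve result includes a relevance score, so a pragmatic first step is a client-side cutoff before the chunks ever reach the prompt. A sketch; the 0.5 threshold is an arbitrary assumption you should tune against your own evaluation set:

MIN_SCORE = 0.5  # arbitrary starting point, not an AWS default

def filter_by_score(results, threshold=MIN_SCORE):
    # Drop chunks the vector store scored below the threshold.
    return [r for r in results if r.get('score', 0.0) >= threshold]

high_confidence = filter_by_score(response['retrievalResults'])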

Knowledge Check: Test Your Retrieval Skills

A developer is building a RAG application for a global retail brand. Some items have very similar semantic descriptions (e.g., 'Blue Running Shoes'), but the system MUST distinguish between specific SKU numbers (e.g., 'RS-100-BL' vs 'RS-200-BL'). Which search configuration is most appropriate?


Summary

You have now mastered the art of retrieval. This concludes Module 4. In the final module of Domain 1, we will look at Compliance and Data Governance, ensuring your Knowledge Base doesn't become a security liability.


Next Module: The Fortress: Handling Sensitive Data and Privacy
