
Hosted Embeddings via Bedrock
Leverage AWS Bedrock for enterprise-grade, scalable, and secure multimodal embeddings.
For enterprise RAG systems running on AWS, Amazon Bedrock provides a secure way to access foundation models (such as Amazon Titan and Cohere Embed) through a single API, without managing servers.
The AWS Bedrock Advantage
- Security & Compliance: Data stays within your AWS perimeter, which helps satisfy HIPAA, SOC 2, and GDPR requirements.
- Standardized API: One API to access embedding models from multiple providers (see the sketch after this list).
- Pay-as-you-go: Direct integration with AWS billing.
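The standardized API is easy to see in practice: the control-plane client can enumerate every embedding model available to your account, regardless of provider. A minimal sketch, assuming your AWS credentials and region are already configured:

import boto3

# Control-plane client ("bedrock"), distinct from the "bedrock-runtime" client used for inference
bedrock_control = boto3.client(service_name="bedrock")

# List foundation models whose output modality is an embedding vector
response = bedrock_control.list_foundation_models(byOutputModality="EMBEDDING")
for model in response["modelSummaries"]:
    print(model["providerName"], model["modelId"])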
Using Amazon Titan Multimodal Embeddings
Amazon Titan Multimodal Embeddings is designed for RAG systems that mix text and visual data: it maps text and images into a shared embedding space, so a text query can retrieve relevant images and vice versa.
Implementation with Boto3
import boto3
import json
import base64

# Bedrock Runtime client; region and credentials come from your AWS configuration
bedrock = boto3.client(service_name='bedrock-runtime')

def get_titan_multimodal_embedding(text=None, image_path=None):
    """Embed text, an image, or both with Amazon Titan Multimodal Embeddings."""
    body = {}
    if text:
        body["inputText"] = text
    if image_path:
        # Titan expects the image as a base64-encoded string
        with open(image_path, "rb") as image_file:
            body["inputImage"] = base64.b64encode(image_file.read()).decode('utf8')

    response = bedrock.invoke_model(
        body=json.dumps(body),
        modelId="amazon.titan-embed-image-v1",
        accept="application/json",
        contentType="application/json"
    )
    response_body = json.loads(response.get('body').read())
    return response_body.get('embedding')
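As a quick usage sketch (the caption and file name below are placeholders), the same helper can embed a text query, an image, or both at once:

# Hypothetical inputs: any short caption and any local image file will do
query_vector = get_titan_multimodal_embedding(text="red running shoes")
image_vector = get_titan_multimodal_embedding(image_path="shoes.jpg")

# Titan Multimodal returns 1024-dimensional vectors by default
print(len(query_vector), len(image_vector))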
Model Selection on Bedrock
| Model ID | Provider | Supported Modalities |
|---|---|---|
| amazon.titan-embed-text-v1 | Amazon | Text |
| amazon.titan-embed-image-v1 | Amazon | Text, Image |
| cohere.embed-english-v3 | Cohere | Text |
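Switching models is mostly a matter of changing the modelId, but each provider defines its own request schema. As a hedged sketch of the Cohere side (field names follow Cohere's Embed v3 request format on Bedrock), the body takes a list of texts and an input_type rather than Titan's inputText:

import boto3
import json

bedrock = boto3.client(service_name="bedrock-runtime")

# Cohere Embed expects a list of texts plus an input_type hint
body = json.dumps({
    "texts": ["What is our refund policy?"],
    "input_type": "search_query"  # use "search_document" when embedding your corpus
})

response = bedrock.invoke_model(
    body=body,
    modelId="cohere.embed-english-v3",
    accept="application/json",
    contentType="application/json"
)
embeddings = json.loads(response["body"].read())["embeddings"]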
Best Practices for Cloud Embeddings
- Error Handling: Implement exponential backoff for ThrottlingException errors (a sketch follows this list).
- Latency Optimization: Use the invoke_model endpoint for single queries and batch processing for bulk ingestion.
- Regions: Keep your Bedrock model in the same AWS region as your vector database (e.g., Pinecone or Aurora) to minimize latency.
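One way to implement the backoff is a small retry wrapper around invoke_model; a minimal sketch, with arbitrary retry counts and delays:

import json
import time
from botocore.exceptions import ClientError

def invoke_with_backoff(bedrock, body, model_id, max_retries=5, base_delay=1.0):
    """Retry invoke_model with exponential backoff on ThrottlingException."""
    for attempt in range(max_retries):
        try:
            return bedrock.invoke_model(
                body=json.dumps(body),
                modelId=model_id,
                accept="application/json",
                contentType="application/json"
            )
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise  # only retry throttling; surface other errors immediately
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("Bedrock request was still throttled after all retries")

boto3 can also handle retries for you through its built-in configuration (for example, passing botocore.config.Config(retries={"mode": "adaptive"}) when creating the client).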
Exercises
- Set up an AWS account and enable the Titan Multimodal Embeddings model.
- Use the code above to embed a local image.
- What is the "Price per 1M characters" for the Titan model compared to OpenAI?