Module 6 Lesson 2: Managing Chat State
Memory in the Cloud. How to maintain conversation context across multiple API requests.
Remembering the Past: Chat History
The Converse API is Stateless. If a user says "Who is Steve Jobs?" and then asks "When was he born?", Bedrock doesn't know who "he" is unless you send the first question back again.
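To make "he" resolvable, the second request must replay the first exchange. A minimal sketch of the payloads in Converse API message format (the assistant text is a placeholder, not a real model reply):

```python
# Turn 1: the only message is the user's first question.
turn_1 = [
    {"role": "user", "content": [{"text": "Who is Steve Jobs?"}]},
]

# Turn 2, sent alone: ambiguous -- Bedrock has no memory of turn 1,
# so it cannot resolve who "he" refers to.
ambiguous = [
    {"role": "user", "content": [{"text": "When was he born?"}]},
]

# Turn 2, done correctly: replay the whole conversation so far.
# (The assistant text below is a hypothetical placeholder.)
turn_2 = turn_1 + [
    {"role": "assistant", "content": [{"text": "Steve Jobs co-founded Apple."}]},
    {"role": "user", "content": [{"text": "When was he born?"}]},
]
```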
1. The Handoff Pattern
In a web app, the Client (the browser) usually stores the history and sends the whole list of messages to the API every time.
2. Updated API with History
```python
import boto3
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
client = boto3.client("bedrock-runtime")

class ChatMessage(BaseModel):
    role: str
    content: str

class StatefulChatRequest(BaseModel):
    history: list[ChatMessage]
    new_message: str

@app.post("/chat/state")
async def chat_with_state(request: StatefulChatRequest):
    # Convert Pydantic models to the Converse API message format
    messages = [
        {"role": m.role, "content": [{"text": m.content}]}
        for m in request.history
    ]
    # Add the new user message
    messages.append({"role": "user", "content": [{"text": request.new_message}]})
    response = client.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=messages,
    )
    return {"reply": response["output"]["message"]["content"][0]["text"]}
```
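On the client side, the handoff pattern is a simple loop: send the history plus the new message, then append both the question and the reply before the next turn. A sketch with the network call stubbed out (`send` is a hypothetical stand-in for an HTTP POST to `/chat/state`):

```python
def send(history, new_message):
    # Placeholder for the real HTTP call to POST /chat/state.
    # Echoes a canned reply so the loop can run without a server.
    return f"(reply to: {new_message})"

def chat_turn(history, new_message):
    """Run one turn and return the updated history."""
    reply = send(history, new_message)
    # The client owns the state: both sides of the turn go into history.
    return history + [
        {"role": "user", "content": new_message},
        {"role": "assistant", "content": reply},
    ]

history = []
history = chat_turn(history, "Who is Steve Jobs?")
history = chat_turn(history, "When was he born?")
# After two turns, history holds four messages -- it grows by two per turn.
```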
3. Visualizing context growth
```mermaid
graph TD
    T1[Q1: Hello] --> B1[Response 1]
    T2[Q1 + A1 + Q2: Who are you?] --> B2[Response 2]
    T3[Q1 + A1 + Q2 + A2 + Q3: ...] --> B3[Response 3]
    Note[History grows with every turn]
```
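The growth is linear: on turn n, the request carries 2n - 1 messages (n user questions plus n - 1 assistant answers). A quick check of that arithmetic:

```python
def messages_in_request(turn):
    """Number of messages sent on turn `turn` (1-indexed):
    every previous user/assistant pair, plus the new user message."""
    return 2 * (turn - 1) + 1

# Turn 1 sends 1 message, turn 2 sends 3, turn 10 sends 19.
```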
4. The Context Wall
You cannot send infinite history. Every model has a Context Window limit (e.g., 200,000 tokens).
- Rule of thumb: keep only the last 10-20 messages to control cost and avoid hitting the limit.
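A minimal trimming helper, sketched under one assumption: the Converse API requires the conversation to start with a user message, so if the cut lands on an assistant message we drop it:

```python
def trim_history(messages, max_messages=20):
    """Keep only the most recent messages, ensuring the list
    still begins with a 'user' turn."""
    trimmed = messages[-max_messages:]
    # If the window starts mid-pair on an assistant message, drop it
    # so roles still alternate starting from "user".
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed
```

Call this on the client's history just before each request; a smarter variant could count tokens instead of messages, or summarize the dropped turns.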
Summary
- Bedrock's Converse API is stateless: it retains nothing between requests.
- History must be managed by the application layer.
- The Converse API expects the full list of previous turns in every call.
- Trimming history is essential for cost and performance.