
Memory Storage Strategies: Redis, SQL, and Vector Databases
Build the infrastructure for persistent AI agents. Compare the implementation of Redis for session state, SQL for structured logs, and Vector Databases for semantic knowledge retrieval in your Gemini ADK projects.
Choosing where to store your agent's memory is one of the most consequential decisions in AI systems architecture. It impacts not only the intelligence of the agent (how much it remembers) but also its latency (how fast it responds), its cost (how much data you store and retrieve), and its security (how you protect user data).
In the Gemini ADK, there is no "one size fits all" storage solution. Instead, professional architectures use a tiered approach. In this lesson, we will analyze the three primary storage strategies—Redis, SQL, and Vector Databases—and learn how to build a unified memory layer for production-grade agents.
1. The Multi-Tiered Storage Model
Think of agent storage as a hierarchy, moving from "Hot" (fast/local) to "Cold" (slow/durable).
| Tier | Technology | Purpose | Latency |
|---|---|---|---|
| L1: Hot | In-Memory / Python Dicts | Active turn history within a single request. | < 1ms |
| L2: Warm | Redis / NoSQL | Active session persistence across multiple requests. | 1-5ms |
| L3: Cold | SQL (Postgres / MySQL) | Permanent audit logs and structured relationship history. | 10-50ms |
| L4: Deep | Vector DB (Chroma / Pinecone) | Semantic search across years of historical data. | 50-200ms |
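One way to read this table is as a read-through lookup: check the fastest tier first, fall back to slower tiers, and copy any hit upward. The sketch below is only an illustration under simplifying assumptions. Plain dictionaries stand in for Redis and SQL, and `TieredMemory` and its method names are hypothetical, not part of the Gemini ADK.

```python
from typing import Optional


class TieredMemory:
    """Read-through lookup across tiers, fastest first.

    Plain dicts stand in for the real backends (Redis, Postgres);
    the class and method names are illustrative, not an ADK API.
    """

    def __init__(self) -> None:
        self.hot = {}    # L1: in-process dictionary
        self.warm = {}   # L2: stand-in for Redis
        self.cold = {}   # L3: stand-in for SQL

    def write(self, key: str, value: str) -> None:
        # Write through every tier so the durable copy always exists.
        self.hot[key] = value
        self.warm[key] = value
        self.cold[key] = value

    def read(self, key: str) -> Optional[str]:
        # Check tiers in order of latency and promote hits to faster tiers.
        if key in self.hot:
            return self.hot[key]
        if key in self.warm:
            self.hot[key] = self.warm[key]
            return self.hot[key]
        if key in self.cold:
            value = self.cold[key]
            self.warm[key] = value
            self.hot[key] = value
            return value
        return None


memory = TieredMemory()
memory.write("session:42:name", "Sudeep")
memory.hot.clear()                      # simulate a new request with an empty L1
print(memory.read("session:42:name"))   # served from the warm tier, then re-cached
```

The Vector DB tier is left out here because it answers similarity queries rather than exact key lookups.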
2. Redis: The King of Session State
When building a web-based agent (e.g., a chatbot on a website), users move between pages or refresh their browsers. Redis is the standard tool for ensuring the agent doesn't "reset" every time this happens.
Why Redis?
- Speed: Being in-memory, Redis provides near-instant access to the last 10 turns of a conversation.
- TTL (Time to Live): You can set a session to automatically expire after 30 minutes of inactivity, ensuring you aren't storing thousands of "dead" sessions.
- Data Structures: Redis `LIST` types map perfectly to the Gemini `history` format.
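The three points above can be combined in a few lines. The following is a minimal sketch, assuming a redis-py style client that exposes `rpush`, `ltrim`, `expire`, and `lrange`. The helper names, the `chat:` key prefix, and the role/parts dictionary shape are illustrative choices rather than a prescribed ADK format.

```python
import json

MAX_TURNS = 10          # keep only the most recent turns
SESSION_TTL = 30 * 60   # seconds: expire after 30 minutes of inactivity


def append_turn(client, session_id: str, role: str, text: str) -> None:
    """Append one turn, trim to the last MAX_TURNS, and refresh the TTL."""
    key = f"chat:{session_id}"
    turn = json.dumps({"role": role, "parts": [text]})
    client.rpush(key, turn)             # add to the tail of the list
    client.ltrim(key, -MAX_TURNS, -1)   # keep only the newest MAX_TURNS items
    client.expire(key, SESSION_TTL)     # each write restarts the inactivity timer


def load_turns(client, session_id: str) -> list:
    """Return the stored turns, oldest first, as role/parts dicts."""
    raw = client.lrange(f"chat:{session_id}", 0, -1)
    return [json.loads(item) for item in raw]
```

With redis-py, `client` would typically be created as `redis.Redis(host="localhost", port=6379, decode_responses=True)`. Because every write resets the expiry, an active session stays alive while an abandoned one disappears on its own.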
3. SQL: The Auditor's Choice
If you are building an agent for an enterprise (e.g., a Legal or Finance assistant), you need more than just "chat logs." You need traceability.
Why SQL (Postgres/MySQL)?
- Relational Integrity: You can link a specific "Agent Decision" to a specific "User Ticket" and a specific "Database Change."
- Searchability: You can query: "Show me all agents that used the 'Delete' tool in the last 24 hours."
- JSON Support: Modern Postgres supports JSONB, allowing you to store the raw Gemini API response while still maintaining relational links.
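To make these points concrete, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs without a server. The table names, column names, and helper functions are assumptions made for illustration. In Postgres, the `raw_response` column would be `JSONB` rather than text.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE tickets (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL
);
CREATE TABLE agent_actions (
    id INTEGER PRIMARY KEY,
    ticket_id INTEGER NOT NULL REFERENCES tickets(id),
    agent_name TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    raw_response TEXT NOT NULL,   -- JSON text here; JSONB in Postgres
    created_at TEXT NOT NULL      -- ISO-8601 UTC timestamp
);
""")


def log_action(ticket_id, agent_name, tool_name, raw_response, when=None):
    """Record one agent decision, linked to the ticket that triggered it."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO agent_actions "
        "(ticket_id, agent_name, tool_name, raw_response, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (ticket_id, agent_name, tool_name, json.dumps(raw_response),
         when.isoformat(timespec="seconds")),
    )


def agents_using_tool(tool_name, hours=24):
    """Return the agents that used a given tool within the last `hours`."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
    rows = conn.execute(
        "SELECT DISTINCT agent_name FROM agent_actions "
        "WHERE tool_name = ? AND created_at >= ? ORDER BY agent_name",
        (tool_name, cutoff.isoformat(timespec="seconds")),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO tickets (id, user_id) VALUES (1, 'user_1')")
now = datetime.now(timezone.utc)
log_action(1, "billing-agent", "delete", {"text": "Removed duplicate invoice"},
           now - timedelta(hours=2))
log_action(1, "archive-agent", "delete", {"text": "Purged old record"},
           now - timedelta(days=3))
log_action(1, "support-agent", "search", {"text": "Looked up policy"}, now)
print(agents_using_tool("delete"))   # only actions from the last 24 hours match
```

The foreign key on `ticket_id` is what provides the relational link: an action that points at a nonexistent ticket is rejected at insert time.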
4. Vector Databases: Semantic Knowledge
Vector Databases are what enable RAG (Retrieval-Augmented Generation). Rather than indexing text by its keywords, they index "vectors" (numeric embeddings that represent meaning), usually stored alongside the original text and its metadata.
Why Vector DBs?
- Semantic Retrieval: The agent can find information based on meaning rather than keywords. Searches for "How do I quit?" will find the "Termination of Employment Policy."
- "Infinite" Context: You can store billions of rows of data and retrieve only the few most relevant passages (for example, the top 3 paragraphs) to fit into Gemini's context window.
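The retrieval step behind both points is a nearest-neighbor search over embeddings. The toy sketch below uses hand-made three-dimensional vectors purely for illustration. Real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a production vector database would use an approximate index instead of this linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented embeddings for illustration only.
DOCUMENTS = {
    "Termination of Employment Policy": [0.9, 0.1, 0.0],
    "Office Parking Rules": [0.0, 0.2, 0.9],
    "Vacation Request Procedure": [0.3, 0.9, 0.1],
}


def top_k(query_vector, k=3):
    """Return the titles of the k documents most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), title)
        for title, vector in DOCUMENTS.items()
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]


query = [0.8, 0.2, 0.1]   # pretend embedding of "How do I quit?"
print(top_k(query, k=1))
```

Only the top-ranked passages are passed to the model, which is how a large corpus can sit behind a fixed-size context window.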
5. Security and Data Privacy (Encryption at Rest)
Memory storage is a prime target for data breaches in AI systems, because it concentrates raw user conversations in one place.
- PII Masking: Before saving a turn to a database, use a middleware to redact sensitive info (names, SSNs, credit cards).
- Tenant Isolation: Ensure that Agent A (for User 1) can never "retrieve" a memory from User 2. In SQL, this is handled by scoping every query to a `user_id` foreign key. In Vector DBs, this is handled via Metadata Filtering.
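Tenant isolation comes down to one rule: filter by owner before ranking, never after. The sketch below shows that rule with an in-memory list. `MemoryRecord` and `retrieve_for_user` are hypothetical names, and the `score` field stands in for the similarity score a real vector index would compute.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    user_id: str    # tenant metadata attached to every stored memory
    text: str
    score: float    # stand-in for a similarity score from the vector index


def retrieve_for_user(records, user_id, k=3):
    """Filter by tenant first, then rank, so other users' memories never compete."""
    own = [record for record in records if record.user_id == user_id]
    own.sort(key=lambda record: record.score, reverse=True)
    return [record.text for record in own[:k]]


records = [
    MemoryRecord("user_1", "Prefers email contact", 0.72),
    MemoryRecord("user_2", "Account number on file", 0.95),
    MemoryRecord("user_1", "Asked about refunds", 0.88),
]
print(retrieve_for_user(records, "user_1"))
```

Note that the highest-scoring record belongs to `user_2` and is still excluded, because the filter runs before the ranking.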
```mermaid
graph TD
    subgraph "Agent Logic"
        A[Gemini Response]
    end
    subgraph "The Storage Pipeline"
        B{PII Filter}
        C[Redis - Active Session]
        D[Postgres - Audit Log]
        E[Vector DB - Knowledge]
    end
    A --> B
    B --> C
    B --> D
    B --> E
```
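The pipeline in the diagram can be sketched as a single masking step followed by a fan-out to every store. This is an illustration under simplifying assumptions: the two regular expressions cover only US-style SSNs and 16-digit card numbers, plain lists stand in for the Redis, Postgres, and Vector DB writers, and the function names are hypothetical. Detecting names reliably needs more than a regex, so it is not attempted here.

```python
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def mask_pii(text):
    """Replace SSN-like and card-like digit groups with placeholders."""
    text = SSN_PATTERN.sub("[REDACTED_SSN]", text)
    return CARD_PATTERN.sub("[REDACTED_CARD]", text)


def store_turn(text, sinks):
    """Mask once, then write the same clean text to every sink."""
    clean = mask_pii(text)
    for sink in sinks:
        sink.append(clean)
    return clean


# Lists stand in for the three storage backends.
session_store, audit_log, knowledge_base = [], [], []
store_turn(
    "My SSN is 123-45-6789 and my card is 4111 1111 1111 1111.",
    [session_store, audit_log, knowledge_base],
)
print(audit_log[0])
```

Masking before the fan-out means no store ever sees the raw text, so there is only one place where the redaction rules have to be right.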
6. Implementation: Managing State with Redis
Let's look at a Python implementation that uses Redis to persist a Gemini chat session between different API calls.
```python
import json

import google.generativeai as genai
import redis

# Assumes the Gemini API key is already configured,
# for example with genai.configure(api_key=...).

# 1. Set up the Redis connection
r = redis.Redis(host='localhost', port=6379, decode_responses=True)


def get_chat_history(session_id: str) -> list:
    # Fetch the serialized history from Redis; a new session starts empty
    state = r.get(f"chat:{session_id}")
    return json.loads(state) if state else []


def save_chat_history(session_id: str, history: list) -> None:
    # chat.history holds Content objects, which json.dumps cannot serialize,
    # so reduce each turn to a plain role/parts dict (text parts only).
    serializable = [
        {"role": turn.role, "parts": [part.text for part in turn.parts]}
        for turn in history
    ]
    # Save to Redis with a 1-hour expiration
    r.set(f"chat:{session_id}", json.dumps(serializable), ex=3600)


def agent_call(session_id: str, user_input: str) -> str:
    # Set up the model
    model = genai.GenerativeModel('gemini-1.5-flash')
    # 2. Hydrate state
    existing_history = get_chat_history(session_id)
    chat = model.start_chat(history=existing_history)
    # 3. Execute the turn
    response = chat.send_message(user_input)
    # 4. Persist the updated history
    save_chat_history(session_id, chat.history)
    return response.text


# Usage:
# first_resp = agent_call("user_123", "Hi, I'm Sudeep.")
# second_resp = agent_call("user_123", "What is my name?")
# -> The second reply should mention "Sudeep"
```
7. Strategy Comparison: When to use what?
| Use Case | Recommended Tech | Why? |
|---|---|---|
| Real-time Chatbot | Redis | High speed, ephemeral. |
| Long-running Research Agent | SQL + Vector DB | Needs to store many docs and relate them to a user. |
| Autonomous Web Scraper | In-Memory | State doesn't need to persist after the script finishes. |
| Financial Auditor Bot | SQL (Highly Structured) | Needs rigid schemas and audit trails. |
8. Summary and Exercises
Storage is the Backbone of Persistence.
- Use Redis for the "Active Conversation."
- Use SQL for the "Identity and Audit Trail."
- Use Vector Databases for the "Knowledge Base."
- Always Encrypt and Filter before you store.
Exercises
- Architecture Design: You are building an agent for a library. The agent needs to:
  - Remember the user's name (____ Storage).
  - Search through 1 million book summaries (____ Storage).
  - Keep a log of every book the user ever borrowed (____ Storage).
- Privacy Protocol: Write a Python function that uses a regular expression to find and replace email addresses in a string with `[REDACTED_EMAIL]` before saving to a database.
- Cost Analysis: Vector Databases like Pinecone charge for the number of "Stored Vectors." If one page of text (500 words) is one vector, how much does it cost to store 10,000 pages? Compare this to the cost of storing those 10,000 pages in a standard Postgres database.
In the next lesson, we will look at Memory Retrieval, exploring the choice between RAG and Native Long-Context in the Gemini era.