
Long-Term Memory: Scaling with External Databases
Learn how to build 'Infinity-Scale' memory. Master the integration of Postgres, Redis, and Graph databases into your agentic token workflow.
The human brain doesn't keep every memory in "Active Consciousness." It moves data to long-term storage and retrieves it only when a stimulus triggers it. Your AI agents should do the same.
In this lesson, we learn how to use external databases as the "Long-Term Memory" (LTM) for your agentic fleet. We will explore Redis for hot state, Postgres for relational history, and knowledge graphs for retrieving complex relationships between facts.
1. The Distributed Memory Cache (Redis)
For "Hot" state that multiple agents need to access quickly (e.g., the current task status), use Redis.
- Token Efficiency: Instead of Agent A passing the state to Agent B (2,000 tokens), Agent A writes to Redis and Agent B reads from Redis.
- Savings: You eliminate the "Hand-off Tax" between agents (see the sketch below).
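A minimal sketch of this hand-off pattern, assuming a local Redis instance and the redis-py client (the `state:{task_id}` key scheme is illustrative):

```python
import json
import redis

r = redis.Redis(decode_responses=True)  # assumes Redis running locally

# Agent A: write the hot state once, with a TTL so stale state expires
def publish_state(task_id: str, state: dict) -> None:
    r.set(f"state:{task_id}", json.dumps(state), ex=3600)

# Agent B: read the state only when it is actually needed
def read_state(task_id: str) -> dict | None:
    raw = r.get(f"state:{task_id}")
    return json.loads(raw) if raw else None

publish_state("task-42", {"status": "in_progress", "owner": "agent_a"})
print(read_state("task-42"))  # {'status': 'in_progress', 'owner': 'agent_a'}
```

The hand-off between agents is now a single key name, not a 2,000-token payload.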
2. The Semantic Archive (Vector DB)
As discussed in Module 6.2, your chat history should live in a vector database.
- The Optimization: Don't just store "What was said." Store "Why it was important."
- Use a small model to generate "Memory Keys" (e.g. `memory: project_deadline`) for every turn. This makes retrieval faster and more accurate, reducing the number of "Irrelevant Chunks" you pull into the context. A sketch of this tagging step follows below.
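Here `summarize_key` stands in for whatever small model you use, and the `upsert` call assumes a generic vector-store client; neither is a specific library API:

```python
def summarize_key(turn_text: str) -> str:
    # Stand-in for a small-model call that returns a compact tag,
    # e.g. "memory: project_deadline". Replace with your own LLM call.
    raise NotImplementedError

def archive_turn(vector_db, embed, turn_text: str, user_id: str) -> None:
    # Store WHY the turn mattered alongside the raw text, so retrieval
    # can filter on the key instead of pulling in irrelevant chunks.
    memory_key = summarize_key(turn_text)
    vector_db.upsert(
        id=f"{user_id}:{hash(turn_text)}",
        vector=embed(turn_text),  # your embedding function
        metadata={"memory_key": memory_key, "user_id": user_id},
    )
```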
3. The Relational Backbone (Postgres/SQL)
For structured facts (Module 11.2), Postgres is the gold standard.
- The Efficiency Pattern: Give your agent a `query_memories` tool (sketched below).
- The agent doesn't need to "Guess" what happened last week. It runs a SQL query: `SELECT fact FROM agent_memories WHERE topic = 'architecture'`.
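One way to implement that tool with psycopg2; the `agent_memories` table comes from the query above, while the connection string is an assumption:

```python
import psycopg2  # any Postgres driver works; psycopg2 shown here

def query_memories(topic: str) -> list[str]:
    """Tool the agent calls instead of carrying last week's history."""
    conn = psycopg2.connect("dbname=agent_memory")  # assumed DSN
    try:
        with conn.cursor() as cur:
            # Parameterised query: the agent supplies only the topic,
            # never raw SQL, which keeps the tool injection-safe.
            cur.execute(
                "SELECT fact FROM agent_memories WHERE topic = %s",
                (topic,),
            )
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()

# Usage: facts = query_memories("architecture")
```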
The full retrieval flow, from agent tool call to verified fact, looks like this:
graph TD
A[Agent] -->|Tool Call| B[FastAPI Backend]
B --> C{Memory Router}
C -->|ID Match| D[(Redis: Hot State)]
C -->|Semantic| E[(Pinecone: Vectors)]
C -->|Relational| F[(Postgres: Facts)]
D & E & F -->|Precise Data| B
B -->|Verified Fact| A
4. Knowledge Graphs (The Advanced Tier)
For complex agents (e.g. "Personal Investment Advisor"), simple text retrieval isn't enough. You need to know that "Company X" is owned by "Company Y."
- Graph Efficiency: Instead of sending 10 documents about Company Y, you send the Graph Relationship Map (roughly 10 tokens), as sketched below.
- The LLM identifies the connection instantly, without needing to read thousands of words to "Infer" the relationship.
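A toy sketch of the token math, using hand-built (subject, relation, object) triples in place of a real graph database:

```python
# In production these edges would come from a graph store (e.g. Neo4j);
# the triples here are illustrative.
TRIPLES = [
    ("Company X", "owned_by", "Company Y"),
    ("Company Y", "headquartered_in", "Berlin"),
]

def relationship_map(entity: str) -> str:
    """Serialise only the edges touching `entity` into compact text."""
    edges = [t for t in TRIPLES if entity in (t[0], t[2])]
    return "\n".join(f"{s} --{r}--> {o}" for s, r, o in edges)

print(relationship_map("Company Y"))
# Company X --owned_by--> Company Y
# Company Y --headquartered_in--> Berlin
```

Those two lines cost a dozen tokens; the documents they replace cost thousands.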
5. Implementation: The Memory Router (Python)
Python Code: Multi-Tier Memory Retrieval
import redis
import sqlite3

redis_client = redis.Redis(decode_responses=True)  # Tier 1: hot state
db = sqlite3.connect("agent_memory.db")  # Tier 2: facts (SQLite here; Postgres in production)

def retrieve_agent_memory(query, user_id):
    # Tier 1: Exact key match (Redis).
    # Check if this is a known hot-state variable.
    short_term = redis_client.get(f"state:{user_id}")
    if short_term:
        return short_term

    # Tier 2: Relational search (SQL).
    # Look for structured facts about this user.
    facts = db.execute(
        "SELECT content FROM facts WHERE user_id = ?", (user_id,)
    ).fetchall()
    if facts:
        return facts

    # Tier 3: Semantic search (vectors).
    # Fall back to general history in the vector store.
    return vector_search(query, user_id)  # your vector-DB search function
6. Summary and Key Takeaways
- Hierarchy is Key: Move from hot (Redis) to structured (SQL) to semantic (Vector).
- Retrieve, don't Carry: Every token you don't carry in the "Hand-off" is money saved.
- Query Interface: Give the agent tools to browse its own memory.
- Relational Clarity: Use SQL for facts that require 100% accuracy (dates, prices, names).
In the next lesson, Managing Large 'Reasoning' Logs, we conclude Module 11 by looking at how to store the agent's thoughts without breaking the budget.
Exercise: The Database Bridge
- Create a simple SQLite table `memories` (a starter sketch follows this list).
- Ask an LLM to "Save the fact that my dog's name is Rex."
- The Agent should call a tool to save it to SQL.
- In a NEW session (Empty Context), ask the agent: "What is my dog's name?"
- The Agent should call a tool to read from SQL.
- Analyze: What was the total context size of the second session? (It will be tiny compared to carrying the 'Rex' fact in every turn).
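A starter sketch for the exercise, using sqlite3 from the standard library. The tool names save_memory and recall_memory are illustrative; wiring them into your LLM's tool-calling API is the exercise:

```python
import sqlite3

db = sqlite3.connect("memories.db")
db.execute("CREATE TABLE IF NOT EXISTS memories (fact TEXT)")

def save_memory(fact: str) -> str:
    """Session 1 tool: persist the fact outside the context window."""
    db.execute("INSERT INTO memories (fact) VALUES (?)", (fact,))
    db.commit()
    return "saved"

def recall_memory() -> list[str]:
    """Session 2 tool: retrieve facts into an otherwise empty context."""
    return [row[0] for row in db.execute("SELECT fact FROM memories")]

save_memory("My dog's name is Rex.")
print(recall_memory())  # ["My dog's name is Rex."]
```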