
Types of AI Memory: Short-Term, Episodic, and Semantic
Deconstruct the memory architecture of intelligent agents. Learn to distinguish between short-term context, episodic event logs, and long-term semantic knowledge to build agents that remember and learn.
A human without memory cannot solve complex problems because they forget the very context of the problem they are solving. The same applies to agents. If an agent "forgets" that the user is the CEO of a company midway through a task, its subsequent decisions might be inappropriate or incorrect.
In the Gemini ADK, memory is not a single database; it is a Hierarchical Architecture. To build agents that feel persistent and intelligent, you must understand the distinction between Short-term Context, Episodic Memory, and Semantic Knowledge. In this lesson, we will explore these three types of memory and how they are used to power agentic reasoning.
1. The Memory Hierarchy
AI memory is often compared to the human brain. We can map the technical components of Gemini to these biological concepts.
```mermaid
graph TD
    subgraph "Working Memory (Fast/Small)"
        A[The Context Window]
    end
    subgraph "Episodic Memory (The 'What Happened')"
        B[Conversation Logs]
        C[Action Traces]
    end
    subgraph "Semantic Memory (The 'What is True')"
        D[Vector Databases]
        E[Knowledge Graphs]
    end
    A <-->|Active Attention| B
    B <-->|Consolidation| D
```
2. Short-Term Memory: The Context Window
In the Gemini ecosystem, Short-term memory is synonymous with the Context Window. This is the data presently being processed by the model's self-attention layers.
Characteristics:
- Instant Recall: The model sees 100% of the information perfectly.
- High Transience: Once the inference call is finished, this memory disappears unless saved externally.
- Physical Limits: Even with Gemini's 2-million token window, it is still a "bucket" that can eventually overflow.
Use Case:
Storing the immediate details of the current user conversation, recent tool outputs, and the specific plan the agent is currently executing.
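To make this concrete, here is a minimal sketch of how an agent might assemble its short-term context before a single inference call. The variable names (`system_instruction`, `recent_turns`, `tool_outputs`) are illustrative placeholders, not ADK APIs.

```python
# A minimal sketch of assembling short-term context for one inference call.
# The names below are illustrative, not part of the Gemini ADK.

system_instruction = "You are a helpful project-planning assistant."

recent_turns = [
    {"role": "user", "parts": ["Draft a launch checklist for the new feature."]},
    {"role": "model", "parts": ["Here is a first draft: 1) QA sign-off, 2) ..."]},
]

tool_outputs = [
    {"role": "user", "parts": ["[tool:calendar] Launch date is 2025-03-14."]},
]

# Everything the model will "see" this turn lives in this single list.
# Once the call returns, nothing here persists unless we save it ourselves.
context_window = recent_turns + tool_outputs
```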
3. Episodic Memory: The Personal Narrative
Episodic Memory is the record of "Episodes" or "Sessions." It answers the question: "What did the user and I do together yesterday?"
Characteristics:
- Time-Bound: Organized chronologically.
- Sequential: Preserves the cause-and-effect relationship of actions.
- Stateful: Often persistent across multiple days or weeks.
Implementation:
Episodic memory is usually stored as a sequence of JSON messages in a database (like Redis or Postgres). When a user returns, the agent "hydrates" its current context with the relevant snippets of these previous episodes.
Use Case:
A coding assistant remembering that you refactored the auth.py file two sessions ago and applying that context to the current bug in login.py.
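As a rough illustration of the logging and "hydration" steps described above, the sketch below persists each turn as a JSON row in SQLite and reloads the most recent ones when the user returns. The table name and columns are assumptions for this example, not an ADK schema.

```python
import json
import sqlite3

# Hypothetical episode store: one JSON-encoded message per row.
conn = sqlite3.connect("episodes.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS episodes (user_id TEXT, ts REAL, message TEXT)"
)

def log_turn(user_id: str, ts: float, role: str, text: str) -> None:
    """Append one turn of the current session to the episodic log."""
    msg = json.dumps({"role": role, "parts": [text]})
    conn.execute("INSERT INTO episodes VALUES (?, ?, ?)", (user_id, ts, msg))
    conn.commit()

def hydrate(user_id: str, max_turns: int = 20) -> list[dict]:
    """Load the most recent turns so the agent can resume where it left off."""
    rows = conn.execute(
        "SELECT message FROM episodes WHERE user_id = ? ORDER BY ts DESC LIMIT ?",
        (user_id, max_turns),
    ).fetchall()
    return [json.loads(r[0]) for r in reversed(rows)]
```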
4. Semantic Memory: The Knowledge Base
Semantic Memory is the storage of facts, concepts, and general knowledge that is not tied to a specific date or time.
Characteristics:
- Relational: Facts are connected by meaning, not by time.
- Vast Scale: Can contain millions or billions of tokens of data (e.g., an entire corporate wiki).
- Retrieval-Based: The agent doesn't "see" all this data at once. It must "search" for it using RAG (Retrieval-Augmented Generation).
Implementation:
Typically stored in a Vector Database (like Pinecone, Chroma, or BigQuery Vector Search) where information is represented as high-dimensional embeddings.
Use Case:
An HR agent knowing that "The company holiday policy allows for 20 days off" (Semantic) vs. "User John took 5 days off last week" (Episodic).
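The retrieval step can be illustrated with plain cosine similarity over embeddings. The in-memory list below stands in for a real vector database, and the embeddings are assumed to come from whatever embedding model you use; nothing here is a specific vendor API.

```python
import math

# Toy in-memory "vector store": (text, embedding) pairs. In production the
# embeddings would come from an embedding model and live in a vector database.
knowledge_base: list[tuple[str, list[float]]] = []

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def add_fact(text: str, embedding: list[float]) -> None:
    knowledge_base.append((text, embedding))

def retrieve(query_embedding: list[float], k: int = 3) -> list[str]:
    """Return the k stored facts whose embeddings are closest to the query."""
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```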
5. Procedural Memory: The "How-To"
There is a fourth, often overlooked type of memory in the ADK: Procedural Memory. This is the agent's memory of how to use its tools.
- Description: In the ADK, this is stored in the Tool Definitions and few-shot examples.
- Why it matters: If an agent has a tool to "Execute SQL," but it doesn't remember the schema of the database, its procedural memory is broken.
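In practice, much of this procedural memory lives in the structured tool declaration itself. The snippet below follows the general shape of a function declaration for tool calling, but the specific tool, tables, and schema are made up for this example.

```python
# A hypothetical tool definition. The description and parameter schema ARE the
# agent's procedural memory for this tool: if the database schema changes and
# this declaration is not updated, the agent's "how-to" knowledge is stale.
execute_sql_tool = {
    "name": "execute_sql",
    "description": (
        "Run a read-only SQL query against the analytics warehouse. "
        "Available tables: orders(id, user_id, total), users(id, name, tier)."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "A single SELECT statement. No writes allowed.",
            }
        },
        "required": ["query"],
    },
}
```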
6. The "Forgetting" Problem and Consolidation
As conversations grow, the "Short-term" window gets crowded. If we just keep adding tokens, the agent becomes slow and expensive. We solve this through Consolidation.
The Consolidation Workflow:
- Summarization: Periodically, the ADK takes the oldest 20 turns and asks Gemini to "Summarize the key facts from these turns into a 3-sentence summary."
- Pruning: The raw 20 turns are deleted from the active context.
- Injection: The summary is placed at the top of the history.
Result: You go from 5,000 tokens of history down to 200 tokens, without losing the "essence" of the conversation.
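A minimal consolidation loop might look like the sketch below. The `summarize()` function is a placeholder for an actual Gemini call, and the turn counts simply mirror the workflow above; none of this is an ADK API.

```python
# Sketch of the consolidation workflow described above.

def summarize(turns: list[dict]) -> str:
    """Placeholder: ask the model to compress old turns into a short summary."""
    # e.g. a Gemini call such as: "Summarize the key facts from these turns
    # into a 3-sentence summary: {turns}"
    return "Summary of earlier conversation: ..."

def consolidate(history: list[dict], oldest_n: int = 20) -> list[dict]:
    if len(history) <= oldest_n:
        return history

    old, recent = history[:oldest_n], history[oldest_n:]

    # 1. Summarization: compress the oldest turns into a few sentences.
    summary = summarize(old)

    # 2. Pruning: the raw old turns are dropped from the active context.
    # 3. Injection: the summary becomes the first entry in the history.
    return [{"role": "user", "parts": [summary]}] + recent
```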
7. Implementation: Simple Context "Sliding Window"
Let's look at a Python pattern for managing a simple "Short-term" window that limits history to the last 10 messages.
```python
class AgentMemory:
    def __init__(self, limit=10):
        self.history = []
        self.limit = limit

    def add_message(self, role, text):
        self.history.append({"role": role, "parts": [text]})
        # THE SLIDING WINDOW LOGIC
        if len(self.history) > self.limit:
            print("System: History limit reached. Forgetting oldest turn...")
            # We remove the oldest user/model pair
            self.history = self.history[2:]

    def get_context(self):
        return self.history


# Use it in your ADK agent
memory = AgentMemory(limit=10)
memory.add_message("user", "Hello!")
memory.add_message("model", "How can I help?")
# ... after many turns ...
context = memory.get_context()
```
8. Summary and Exercises
Memory is what turns a "Model" into a "Persistent Assistant."
- Short-term (Context Window): High-speed, ephemeral attention.
- Episodic (Session Logs): The history of interactions.
- Semantic (Vector DB): The vast library of facts.
- Consolidation is the bridge between short-term and long-term memory.
Exercises
- Memory Mapping: You are building a Personal Finance Agent. Categorize these data points:
- A: The user's name. (____)
- B: The current tax bracket for 2026. (____)
- C: The fact that the user bought coffee for $5 this morning. (____)
- D: The manual for the user's banking app. (____)
- Consolidation Design: Write a prompt for Gemini specifically designed to "Summarize a conversation for memory storage." What key information must it NOT forget?
- Scale Planning: If an agent stores 100% of every interaction in a SQL database, how many megabytes of data will a single user generate in one year of daily use? Why does this make Vector Search more attractive at scale?
In the next lesson, we will look at Storage Strategies, exploring the pros and cons of Redis, SQL, and Vector Databases for your Gemini ADK agents.