
Types of AI Memory: Short-Term, Episodic, and Semantic
Deconstruct the memory architecture of intelligent agents. Learn to distinguish between short-term context, episodic event logs, and long-term semantic knowledge to build agents that remember and learn.
A human without memory cannot solve complex problems because they forget the very context of the problem they are solving. The same applies to agents. If an agent "forgets" that the user is the CEO of a company midway through a task, its subsequent decisions might be inappropriate or incorrect.
In the Gemini ADK, memory is not a single database; it is a Hierarchical Architecture. To build agents that feel persistent and intelligent, you must understand the distinction between Short-term Context, Episodic Memory, and Semantic Knowledge. In this lesson, we will explore these three types of memory and how they are used to power agentic reasoning.
1. The Memory Hierarchy
AI memory is often compared to the human brain. We can map the technical components of Gemini to these biological concepts.
```mermaid
graph TD
    subgraph "Working Memory (Fast/Small)"
        A[The Context Window]
    end
    subgraph "Episodic Memory (The 'What Happened')"
        B[Conversation Logs]
        C[Action Traces]
    end
    subgraph "Semantic Memory (The 'What is True')"
        D[Vector Databases]
        E[Knowledge Graphs]
    end
    A <-->|Active Attention| B
    B <-->|Consolidation| D
```
2. Short-Term Memory: The Context Window
In the Gemini ecosystem, Short-term memory is synonymous with the Context Window. This is the data presently being processed by the model's self-attention layers.
Characteristics:
- Instant Recall: The model sees 100% of the information perfectly.
- High Transience: Once the inference call is finished, this memory disappears unless saved externally.
- Physical Limits: Even with Gemini's 2-million token window, it is still a "bucket" that can eventually overflow.
Use Case:
Storing the immediate details of the current user conversation, recent tool outputs, and the specific plan the agent is currently executing.
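To make this concrete, here is a minimal sketch of how an agent might assemble its short-term context before a single inference call. The variable names (`system_instruction`, `recent_turns`, `tool_outputs`) are illustrative placeholders, not ADK APIs.

```python
# A minimal sketch of assembling short-term context for one inference call.
# The names below are illustrative, not part of the Gemini ADK.

system_instruction = "You are a helpful project-planning assistant."

recent_turns = [
    {"role": "user", "parts": ["Draft a launch checklist for the new feature."]},
    {"role": "model", "parts": ["Here is a first draft: 1) QA sign-off, 2) ..."]},
]

tool_outputs = [
    {"role": "user", "parts": ["[tool:calendar] Launch date is 2025-03-14."]},
]

# Everything the model will "see" this turn lives in this single list.
# Once the call returns, nothing here persists unless we save it ourselves.
context_window = recent_turns + tool_outputs
```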
3. Episodic Memory: The Personal Narrative
Episodic Memory is the record of "Episodes" or "Sessions." It answers the question: "What did the user and I do together yesterday?"
Characteristics:
- Time-Bound: Organized chronologically.
- Sequential: Preserves the cause-and-effect relationship of actions.
- Stateful: Often persistent across multiple days or weeks.
Implementation:
Episodic memory is usually stored as a sequence of JSON messages in a database (like Redis or Postgres). When a user returns, the agent "hydrates" its current context with the relevant snippets of these previous episodes.
Use Case:
A coding assistant remembering that you refactored the auth.py file two sessions ago and applying that context to the current bug in login.py.
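As a rough illustration of the logging and "hydration" steps described above, the sketch below persists each turn as a JSON row in SQLite and reloads the most recent ones when the user returns. The table name and columns are assumptions for this example, not an ADK schema.

```python
import json
import sqlite3

# Hypothetical episode store: one JSON-encoded message per row.
conn = sqlite3.connect("episodes.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS episodes (user_id TEXT, ts REAL, message TEXT)"
)

def log_turn(user_id: str, ts: float, role: str, text: str) -> None:
    """Append one turn of the current session to the episodic log."""
    msg = json.dumps({"role": role, "parts": [text]})
    conn.execute("INSERT INTO episodes VALUES (?, ?, ?)", (user_id, ts, msg))
    conn.commit()

def hydrate(user_id: str, max_turns: int = 20) -> list[dict]:
    """Load the most recent turns so the agent can resume where it left off."""
    rows = conn.execute(
        "SELECT message FROM episodes WHERE user_id = ? ORDER BY ts DESC LIMIT ?",
        (user_id, max_turns),
    ).fetchall()
    return [json.loads(r[0]) for r in reversed(rows)]
```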
4. Semantic Memory: The Knowledge Base
Semantic Memory is the storage of facts, concepts, and general knowledge that is not tied to a specific date or time.
Characteristics:
- Relational: Facts are connected by meaning, not by time.
- Vast Scale: Can contain millions or billions of tokens of data (e.g., an entire corporate wiki).
- Retrieval-Based: The agent doesn't "see" all this data at once. It must "search" for it using RAG (Retrieval-Augmented Generation).
Implementation:
Typically stored in a Vector Database (like Pinecone, Chroma, or BigQuery Vector Search) where information is represented as high-dimensional embeddings.
Use Case:
An HR agent knowing that "The company holiday policy allows for 20 days off" (Semantic) vs. "User John took 5 days off last week" (Episodic).
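The retrieval step can be illustrated with plain cosine similarity over embeddings. The in-memory list below stands in for a real vector database, and the embeddings are assumed to come from whatever embedding model you use; nothing here is a specific vendor API.

```python
import math

# Toy in-memory "vector store": (text, embedding) pairs. In production the
# embeddings would come from an embedding model and live in a vector database.
knowledge_base: list[tuple[str, list[float]]] = []

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def add_fact(text: str, embedding: list[float]) -> None:
    knowledge_base.append((text, embedding))

def retrieve(query_embedding: list[float], k: int = 3) -> list[str]:
    """Return the k stored facts whose embeddings are closest to the query."""
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```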
5. Procedural Memory: The "How-To"
There is a fourth, often overlooked type of memory in the ADK: Procedural Memory. This is the agent's memory of how to use its tools.
- Description: In the ADK, this is stored in the Tool Definitions and few-shot examples.
- Why it matters: If an agent has a tool to "Execute SQL," but it doesn't remember the schema of the database, its procedural memory is broken.
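In practice, much of this procedural memory lives in the structured tool declaration itself. The snippet below follows the general shape of a function declaration for tool calling, but the specific tool, tables, and schema are made up for this example.

```python
# A hypothetical tool definition. The description and parameter schema ARE the
# agent's procedural memory for this tool: if the database schema changes and
# this declaration is not updated, the agent's "how-to" knowledge is stale.
execute_sql_tool = {
    "name": "execute_sql",
    "description": (
        "Run a read-only SQL query against the analytics warehouse. "
        "Available tables: orders(id, user_id, total), users(id, name, tier)."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "A single SELECT statement. No writes allowed.",
            }
        },
        "required": ["query"],
    },
}
```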
6. The "Forgetting" Problem and Consolidation
As conversations grow, the "Short-term" window gets crowded. If we just keep adding tokens, the agent becomes slow and expensive. We solve this through Consolidation.
The Consolidation Workflow:
- Summarization: Periodically, the ADK takes the oldest 20 turns and asks Gemini to "Summarize the key facts from these turns into a 3-sentence summary."
- Pruning: The raw 20 turns are deleted from the active context.
- Injection: The summary is placed at the top of the history.
Result: You go from 5,000 tokens of history down to 200 tokens, without losing the "essence" of the conversation.
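A minimal consolidation loop might look like the sketch below. The `summarize()` function is a placeholder for an actual Gemini call, and the turn counts simply mirror the workflow above; none of this is an ADK API.

```python
# Sketch of the consolidation workflow described above.

def summarize(turns: list[dict]) -> str:
    """Placeholder: ask the model to compress old turns into a short summary."""
    # e.g. a Gemini call such as: "Summarize the key facts from these turns
    # into a 3-sentence summary: {turns}"
    return "Summary of earlier conversation: ..."

def consolidate(history: list[dict], oldest_n: int = 20) -> list[dict]:
    if len(history) <= oldest_n:
        return history

    old, recent = history[:oldest_n], history[oldest_n:]

    # 1. Summarization: compress the oldest turns into a few sentences.
    summary = summarize(old)

    # 2. Pruning: the raw old turns are dropped from the active context.
    # 3. Injection: the summary becomes the first entry in the history.
    return [{"role": "user", "parts": [summary]}] + recent
```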
7. Implementation: Simple Context "Sliding Window"
Let's look at a Python pattern for managing a simple "Short-term" window that limits history to the last 10 messages.
```python
class AgentMemory:
    def __init__(self, limit=10):
        self.history = []
        self.limit = limit

    def add_message(self, role, text):
        self.history.append({"role": role, "parts": [text]})
        # THE SLIDING WINDOW LOGIC
        if len(self.history) > self.limit:
            print("System: History limit reached. Forgetting oldest turn...")
            # We remove the oldest user/model pair
            self.history = self.history[2:]

    def get_context(self):
        return self.history


# Use it in your ADK agent
memory = AgentMemory(limit=10)
memory.add_message("user", "Hello!")
memory.add_message("model", "How can I help?")
# ... after many turns ...
context = memory.get_context()
```
8. Summary and Exercises
Memory is what turns a "Model" into a "Persistent Assistant."
- Short-term (Context Window): High-speed, ephemeral attention.
- Episodic (Session Logs): The history of interactions.
- Semantic (Vector DB): The vast library of facts.
- Consolidation is the bridge between short-term and long-term memory.
Exercises
- Memory Mapping: You are building a Personal Finance Agent. Categorize these data points:
- A: The user's name. (____)
- B: The current tax bracket for 2026. (____)
- C: The fact that the user bought coffee for $5 this morning. (____)
- D: The manual for the user's banking app. (____)
- Consolidation Design: Write a prompt for Gemini specifically designed to "Summarize a conversation for memory storage." What key information must it NOT forget?
- Scale Planning: If an agent stores 100% of every interaction in a SQL database, how many megabytes of data will a single user generate in one year of daily use? Why does this make Vector Search more attractive at scale?
In the next lesson, we will look at Storage Strategies, exploring the pros and cons of Redis, SQL, and Vector Databases for your Gemini ADK agents.