Module 8 Lesson 3: Summary Memory

Dense context: how to use an LLM to periodically summarize a conversation and keep the memory footprint small.

ConversationSummaryMemory: The Executive Assistant

As a conversation grows, the raw transcript (Module 8 Lesson 2) becomes too long to fit in the prompt. ConversationSummaryMemory solves this by using a second LLM call to rewrite the conversation into a short paragraph.

1. The Summarization Loop

  1. User: Sends a message.
  2. AI: Responds.
  3. Background Task: An LLM looks at the history and updates the summary.
    • Example: "The user introduced himself as Sudeep and asked about steak recipes. The assistant provided a guide."
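The loop above can be sketched in plain Python with a stub in place of the real summarizer model (`summarize()` and `SummaryMemory` are illustrative names, not LangChain APIs):

```python
# Minimal sketch of the summarization loop.
# summarize() is a stand-in for a real LLM call.
def summarize(existing_summary: str, new_lines: str) -> str:
    # A real implementation would prompt the model to fold
    # new_lines into existing_summary and return a short paragraph.
    return (existing_summary + " " + new_lines).strip()

class SummaryMemory:
    def __init__(self) -> None:
        self.summary = ""  # the running summary replaces the raw transcript

    def save_context(self, user_msg: str, ai_msg: str) -> None:
        # Step 3 of the loop: after each exchange, update the summary.
        new_lines = f"Human: {user_msg} AI: {ai_msg}"
        self.summary = summarize(self.summary, new_lines)

memory = SummaryMemory()
memory.save_context("Hi, I'm Sudeep. Any steak recipes?", "Sure, here's a guide.")
memory.save_context("Medium rare, please.", "Noted.")
print(memory.summary)
```

Because the summary, not the transcript, is what persists, the memory only ever holds one short string.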

2. Using it in Python

from langchain.memory import ConversationSummaryMemory
from langchain_openai import ChatOpenAI

# We need an LLM to perform the summarization
model = ChatOpenAI()  # any chat model works here
memory = ConversationSummaryMemory(llm=model)

memory.save_context({"input": "I am traveling to Paris tomorrow."}, {"output": "How exciting!"})
memory.save_context({"input": "I need to know the weather there."}, {"output": "It will be 20 degrees."})

print(memory.load_memory_variables({}))
# Example output (exact wording varies by model):
# {'history': 'The human is traveling to Paris and asked about the weather, which is 20 degrees.'}
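On the next turn, that 'history' string is interpolated into the prompt in place of the full transcript. A minimal sketch with hypothetical prompt wording:

```python
# Hypothetical prompt assembly: the summary stands in for the whole chat.
history = ("The human is traveling to Paris and asked about the weather, "
           "which is 20 degrees.")
prompt = (
    "The following is a summary of the conversation so far:\n"
    f"{history}\n\n"
    "Human: Should I pack a jacket?\n"
    "AI:"
)
print(prompt)  # the model sees two lines of context instead of the full log
```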

3. Visualizing Summary Compression

graph LR
    M1[Message 1: 50 words] --> S[Summarizer LLM]
    M2[Message 2: 50 words] --> S
    M3[Message 3: 50 words] --> S
    S --> Final[Summary: 20 words]

4. Pros and Cons

  • Pros: The context stays roughly the same size no matter how long the conversation runs, keeping per-turn token costs low on 100+ message threads.
  • Cons: You lose the exact wording. If the user says "Call me 'S-Man'", the summarizer might just write "The user asked for a nickname," losing the nickname itself.

5. Engineering Tip: Summary Buffer

For the best of both worlds, many developers use Summary Buffer Memory (ConversationSummaryBufferMemory in LangChain).

  • It keeps the most recent messages as a raw transcript (for precision); in LangChain the window is bounded by a token limit (max_token_limit) rather than a fixed message count.
  • It summarizes everything older than that window (for long-term context).
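The hybrid scheme can be sketched as follows, using a fixed message count instead of LangChain's token limit and a stub summarizer (both simplifications):

```python
from collections import deque

def summarize(existing: str, dropped_turn: str) -> str:
    # Placeholder for an LLM call that folds dropped_turn into existing.
    return (existing + " " + dropped_turn).strip()

class SummaryBufferMemory:
    def __init__(self, max_recent: int = 5) -> None:
        self.max_recent = max_recent
        self.recent = deque()  # raw transcript of the latest turns (precision)
        self.summary = ""      # compressed older context (long-term memory)

    def save_context(self, user_msg: str, ai_msg: str) -> None:
        self.recent.append(f"Human: {user_msg} AI: {ai_msg}")
        while len(self.recent) > self.max_recent:
            # Oldest turn falls out of the raw window: fold it into the summary.
            self.summary = summarize(self.summary, self.recent.popleft())

    def load_memory_variables(self) -> dict:
        history = (self.summary + "\n" + "\n".join(self.recent)).strip()
        return {"history": history}

mem = SummaryBufferMemory(max_recent=2)
for i in range(4):
    mem.save_context(f"message {i}", f"reply {i}")
print(mem.load_memory_variables()["history"])
```

Recent turns keep their exact wording (so "Call me 'S-Man'" survives while it is fresh), and only older turns pay the precision cost of summarization.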

Key Takeaways

  • Summary Memory uses an LLM to compress conversation history.
  • It solves the infinite-growth problem of buffer memory.
  • It trades precision for efficiency.
  • It is a good fit for long-running agents and support bots.
