Module 10 Lesson 4: Chunking Strategies
·AI & LLMs

Module 10 Lesson 4: Chunking Strategies

How to slice your data. Techniques for breaking large documents into AI-sized pieces without losing context.

Chunking: Preparing Your Knowledge

You cannot give a 300-page book to an embedding model all at once. Most embedding models cap their input at a few thousand tokens, and even if one accepted the whole book, the resulting vector would be "blurry": an average of the entire text, which makes it impossible to find a specific fact on page 42.

Instead, we perform Chunking: breaking the text into small, digestible pieces.

1. The Fixed-Size Chunk

This is the simplest method: you break the text every 1,000 characters, as in the sketch after this list.

  • Pros: Easy to code.
  • Cons: You might cut a sentence in half, making both halves lose their meaning.
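A minimal sketch in plain Python (the function name and the 1,000-character default are illustrative, not from any particular library):

```python
def fixed_size_chunks(text: str, chunk_size: int = 1000) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```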

2. The Overlap Method (The Standard)

To prevent this "cut a fact in half" problem, we use Overlap.

  • Chunk Size: 1,000 characters.
  • Overlap: 200 characters.

The end of Chunk 1 is the same as the beginning of Chunk 2. This ensures that context "bleeds" across the splits, so the AI always sees the full thought.
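The same sketch extended with overlap. With the numbers above, the window advances 800 characters per step, so consecutive chunks share 200 characters:

```python
def overlapping_chunks(text: str, chunk_size: int = 1000,
                       overlap: int = 200) -> list[str]:
    """Split text so each chunk repeats the final `overlap` characters
    of the previous chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # 800 with the defaults above
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```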


3. Semantic Chunking (Advanced)

Instead of counting characters, we split wherever the text itself signals a logical break:

  • It looks for new headers (#, ##).
  • It looks for double newlines (\n\n).
  • It looks for transitions between ideas, often detected by comparing embeddings of adjacent sentences or by asking an LLM where a new topic starts.

This results in "high-quality" chunks where every piece contains a single, complete idea.
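A rule-based sketch covering the first two signals. Full semantic chunkers usually add an embedding-similarity pass for the third, which this example omits:

```python
import re

def logical_chunks(text: str) -> list[str]:
    """Split Markdown-style text at headers (#, ##, ...) and blank lines."""
    # Split just before a header line, or at runs of blank lines.
    pieces = re.split(r"(?=^#{1,6} )|\n{2,}", text, flags=re.MULTILINE)
    return [p.strip() for p in pieces if p.strip()]
```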


4. Chunking for Different Data

The strategy changes based on what you are reading (see the sketch after this list):

  • Markdown: Split by Headers.
  • Python Code: Split by Class or Function.
  • Financial Reports: Split by Tables or Rows.
  • Conversations: Split by Speaker.
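As one example of format-aware splitting, this sketch treats every top-level function or class in a Python file as its own chunk, using only the standard library:

```python
import ast

def python_chunks(source: str) -> list[str]:
    """Return one chunk per top-level function or class in Python source."""
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)  # exact source text of the node
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
```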

5. The "Context Window" Limit

Your chunk size must be smaller than your model's context window. If your chunk is 8,000 tokens and your model's context window is 4,000 tokens, the model simply cannot read the whole chunk you just retrieved!

Best Practice: Keep chunks around 500 to 1,000 tokens. That size fits comfortably in virtually every model's context window and is small enough to keep search precise.
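Enforcing that budget means counting tokens, not characters. A sketch using the tiktoken library (an assumption here; other model families ship their own tokenizers, and the right encoding depends on your model):

```python
import tiktoken  # pip install tiktoken

def fits_in_context(chunk: str, max_tokens: int = 1000) -> bool:
    """Check that a chunk stays within a token budget.

    cl100k_base is the encoding used by many recent OpenAI models.
    """
    enc = tiktoken.get_encoding("cl100k_base")
    return len(enc.encode(chunk)) <= max_tokens
```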


Key Takeaways

  • Chunking is the process of slicing long data into pieces.
  • Overlap prevents facts from being lost at the split point.
  • Smaller chunks give higher-precision search.
  • Larger chunks give better general context but cost more prompt tokens per retrieved result.
