Reducing Irrelevant Context

Master techniques to strip noise and maintain high-density information for your LLM generation step.

"Garbage in, garbage out" applies perfectly to RAG. Even when you retrieve the right document, 2,000 words of legal boilerplate or website headers around the relevant passage can lead the LLM to hallucinate or miss the details that matter.

The Goal: High Information Density

Your objective is to provide the LLM with the maximum amount of relevant information using the minimum number of tokens.

Technique 1: Post-Retrieval Scraping

If you retrieve a chunk from a web page, don't send the full HTML. Use a library like BeautifulSoup or Trafilatura to extract only the narrative text.
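To make the idea concrete, here is a minimal sketch that uses only Python's standard library instead of BeautifulSoup or Trafilatura, which handle messy real-world pages far more robustly. The `SKIP_TAGS` set and the function name `extract_narrative_text` are assumptions for illustration, not part of any library.

```python
from html.parser import HTMLParser

# Tags whose contents are rarely narrative text (an assumption; tune per site).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class NarrativeTextExtractor(HTMLParser):
    """Collects text nodes while ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Join the text nodes and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._parts).split())


def extract_narrative_text(html: str) -> str:
    parser = NarrativeTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `extract_narrative_text(page_html)` on a retrieved page returns the body text without navigation, scripts, or footers, so only that text is passed on to the prompt.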

Technique 2: LLM-Based Summarization

Before sending retrieved chunks to your final prompt, run a fast, cheap model (like Claude 3 Haiku) to summarize them.

summary_prompt = f"Summarize the following document, focusing only on facts related to this query: {user_query}\n\nDocument:\n{retrieved_doc}"
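The prompt above can be wrapped in a small helper that summarizes each chunk before the final generation step. This is a sketch under assumptions: `complete` is a hypothetical function that sends a prompt to whichever cheap model you use and returns its text, and `summarize_chunks` is an illustrative name, not a library API.

```python
from typing import Callable, List


def build_summary_prompt(user_query: str, retrieved_doc: str) -> str:
    """Query-focused summarization prompt for a single retrieved chunk."""
    return (
        "Summarize the following document, focusing only on facts "
        f"related to this query: {user_query}\n\n"
        f"Document:\n{retrieved_doc}"
    )


def summarize_chunks(
    user_query: str,
    chunks: List[str],
    complete: Callable[[str], str],  # hypothetical: prompt in, model text out
) -> List[str]:
    """Summarize every chunk and drop any empty responses."""
    summaries = []
    for chunk in chunks:
        summary = complete(build_summary_prompt(user_query, chunk)).strip()
        if summary:  # skip chunks for which the model returned nothing
            summaries.append(summary)
    return summaries
```

Passing the model call in as a function keeps the helper independent of any particular SDK and makes it easy to test with a stub. Each chunk costs one extra model call, which is where the added latency of this technique comes from.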

Technique 3: Contextual Compression

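This is a LangChain feature that filters retrieved content against the query before it reaches the prompt. The EmbeddingsFilter compressor embeds the query and each retrieved document, then keeps only the documents whose similarity to the query meets a threshold. To filter at the sentence level, split the documents into smaller pieces before applying the filter.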

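# The import path may differ between LangChain versions.
from langchain.retrievers.document_compressors import EmbeddingsFilter
# `embeddings` is any LangChain embeddings model; `docs` is a list of retrieved Document objects.
embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
# Keep only the documents whose similarity to `query` meets the threshold.
compressed_docs = embeddings_filter.compress_documents(docs, query)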
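To show what sentence-level filtering looks like without any dependencies, here is a rough sketch. It swaps the embedding model for a toy bag-of-words vector, so it only matches shared words rather than meaning; the function names and the 0.2 threshold are assumptions for illustration. With a real embedding model you would replace both `bow_vector` and `cosine`.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compress_chunk(chunk: str, query: str, threshold: float = 0.2) -> str:
    """Keep only the sentences whose similarity to the query meets the threshold."""
    query_vec = bow_vector(query)
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences if cosine(bow_vector(s), query_vec) >= threshold]
    return " ".join(kept)
```

The structure is the same as the embedding-based version: score each piece against the query, then keep what clears the threshold.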

Technique 4: Removing Redundancy

If retrieval returns five paragraphs that all say "The price is $50", remove four of them. Duplicates add no new information, crowd out other relevant chunks, and cost tokens you pay for.
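One simple way to do this is to compare chunks by word overlap and keep only the first chunk of each near-duplicate group. The sketch below uses Jaccard similarity on word sets; the 0.8 threshold and the function names are assumptions, and an embedding-based comparison would catch paraphrases that this misses.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased word set; keeps '$' so prices like $50 stay distinct tokens."""
    return set(re.findall(r"[a-z0-9$]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_redundant(chunks: List[str], threshold: float = 0.8) -> List[str]:
    """Keep the first chunk of each near-duplicate group, preserving retrieval order."""
    kept: List[str] = []
    kept_tokens: List[Set[str]] = []
    for chunk in chunks:
        tokens = _tokens(chunk)
        if any(jaccard(tokens, seen) >= threshold for seen in kept_tokens):
            continue  # too similar to a chunk we already kept
        kept.append(chunk)
        kept_tokens.append(tokens)
    return kept
```

Because the chunks are processed in retrieval order, the highest-ranked copy of each duplicate is the one that survives.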

Measuring Success

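Use the Context Precision metric (available in evaluation frameworks such as Ragas). It scores how many of the retrieved chunks are actually relevant to the question and whether the relevant ones are ranked first, which makes it a practical measure of the signal-to-noise ratio in your retrieved chunks.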
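As a rough illustration of how such a score can be computed, the sketch below averages precision@k over the ranks that hold a relevant chunk, which is the rank-aware formulation Ragas documents for this metric. The relevance labels are assumed to come from human annotation or an LLM judge; the function name is illustrative.

```python
from typing import Sequence


def context_precision(relevance: Sequence[bool]) -> float:
    """Rank-aware precision over retrieved chunks.

    relevance[i] is True when the chunk at rank i + 1 is relevant to the question.
    """
    hits = 0
    score = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            score += hits / rank  # precision@rank, counted only at relevant ranks
    return score / hits if hits else 0.0
```

For example, `context_precision([True, False, True])` averages 1/1 and 2/3, giving roughly 0.83, while the same two relevant chunks ranked first and second would score 1.0.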

| Strategy | Performance Gain | Latency Hit |
| --- | --- | --- |
| Native Context | Baseline | 0 ms |
| Header/Footer Stripping | High | Low |
| LLM Summarization | Very High | High |
| Semantic Filtering | Medium | Medium |

Exercises

  1. Why is "Noise" in retrieval more dangerous than having no information at all?
  2. Write a function that removes all lines from a text chunk that contain fewer than 5 words (often headers or noise).
  3. How can you use "Metadata" to decide whether to skip a retrieved document entirely?
