
Handling Multi-Hop Queries at Scale: The Fan-Out Problem
Master the performance challenges of deep graph traversal. Learn how to prevent the 'Exponential Explosion' of context and how to use pruning techniques to keep multi-hop reasoning fast and efficient.
Handling Multi-Hop Queries at Scale: The Fan-Out Problem
Multi-hop retrieval is the "Superpower" of Graph RAG, but it is also its "Kryptonite." If every node in your graph has 10 neighbors:
- 1-Hop: 10 nodes.
- 2-Hops: 100 nodes.
- 3-Hops: 1,000 nodes.
- 4-Hops: 10,000 nodes!
This is the Fan-Out Problem. If you don't control the search, a single user question can trigger a query that reads half your database and returns a 10-million-word context block.
In this lesson, we will look at the strategies for Pruning the multi-hop walk. We will learn how to use Edge Filtering, Path Selection, and Beam Search to keep your multi-hop reasoning precise, fast, and token-efficient.
1. Edge-Type Filtering: The "Blindfolds"
You don't need to follow every edge. If the user asks about "Management," your agent should stop following [:MENTIONED_IN] or [:SIMILAR_TO] edges.
Query Optimization:
MATCH (p:Person {name: 'Sudeep'})-[:MANAGES|REPORTS_TO*1..3]->(hierarchy)
- By specifying the exact edge types to follow, you "Prune" the search space. You ignore 90% of the graph that isn't related to the management hierarchy.
2. Global Aggregation (Summarization Nodes)
Microsoft's "GraphRAG" implementation uses a technique called Leiden Clustering.
- They group the graph into "Communities" (e.g., The "Marketing Community," The "Logistics Community").
- They generate a Summary Node for each community.
- When a query comes in, the system retrieves the high-level Summaries instead of the individual 10,000 nodes.
This turns a 10,000-hop problem into a 10-hop problem.
3. Beam Search: Selection of the Fittest
Instead of exploring all neighbors, Beam Search only explores the "Top K" neighbors at each level, based on a relevance score (like PageRank or Vector Similarity).
- Level 1: Look at 100 neighbors. Pick the top 5.
- Level 2: Look at the neighbors of those 5. Pick the top 5 again.
- Result: You only explore 25 nodes total, but they are the best 25 nodes across two hops.
graph TD
S((Start)) --> A1[0.9]
S --> A2[0.8]
S --> A3[0.2 - PRUNED]
A1 --> B1[0.95]
A1 --> B2[0.1 - PRUNED]
A2 --> B3[0.85]
A2 --> B4[0.4 - PRUNED]
style S fill:#4285F4,color:#fff
style A1 fill:#34A853,color:#fff
style A2 fill:#34A853,color:#fff
4. Implementation: A Bounded Traversal in Cypher
Let's look at how to limit both height (hops) and width (neighbors).
MATCH (p:Person {id: 'Sudeep'})
// 1. LIMIT THE DEPTH (Hops)
MATCH p = (p)-[*1..2]-(other)
// 2. FILTER BY EDGE TYPE
WHERE all(r in relationships(p) WHERE type(r) IN ['WORKS_AT', 'LEADS', 'PROJECT_OF'])
// 3. LIMIT THE WIDTH (Neighbors)
WITH other, count(*) as frequency
ORDER BY frequency DESC
LIMIT 10
RETURN other.name
5. Summary and Exercises
Multi-hop scaling is about "Intelligent Ignorance."
- Pruning by relationship type is the most effective way to stay fast.
- Summarization collapses large communities into single manageable nodes.
- Beam Search (Breadth Limiting) ensures you only follow the "Golden Paths."
- Bounded Traversal (
*1..2) is a mandatory safety rail.
Exercises
- Pruning Scenario: You are building a bot for a "Recipe Graph." The user asks: "Is there egg in this cake?" List 3 relationship types you should follow and 3 you should IGNORE.
- Breadth Math: If you use a "Beam Width" of 3 and go 3 hops deep, how many total nodes will you visit? Is this better than an unconditioned search that hits 1,000 nodes?
- The "Community" Logic: Why would an "Executive Summary" node be better for a high-level question like "What happened in the company this month?" than a multi-hop walk across 1 million Slack messages?
In the next lesson, we will look at the ultimate interface: Dynamic Query Generation with LLMs.