Handling Multi-Hop Queries at Scale: The Fan-Out Problem

Multi-hop retrieval is the "Superpower" of Graph RAG, but it is also its "Kryptonite." If every node in your graph has 10 neighbors:

1-Hop: 10 nodes.
2-Hops: 100 nodes.
3-Hops: 1,000 nodes.
4-Hops: 10,000 nodes!

This is the Fan-Out Problem. If you don't control the search, a single user question can trigger a query that reads half your database and returns a 10-million-word context block.

In this lesson, we will look at the strategies for Pruning the multi-hop walk. We will learn how to use Edge Filtering, Path Selection, and Beam Search to keep your multi-hop reasoning precise, fast, and token-efficient.

1. Edge-Type Filtering: The "Blindfolds"

You don't need to follow every edge. If the user asks about "Management," your agent should stop following [:MENTIONED_IN] or [:SIMILAR_TO] edges.

Query Optimization: MATCH (p:Person {name: 'Sudeep'})-[:MANAGES|REPORTS_TO*1..3]->(hierarchy)

By specifying the exact edge types to follow, you "Prune" the search space. You ignore 90% of the graph that isn't related to the management hierarchy.

2. Global Aggregation (Summarization Nodes)

Microsoft's "GraphRAG" implementation uses a technique called Leiden Clustering.

They group the graph into "Communities" (e.g., The "Marketing Community," The "Logistics Community").
They generate a Summary Node for each community.
When a query comes in, the system retrieves the high-level Summaries instead of the individual 10,000 nodes.

This turns a 10,000-hop problem into a 10-hop problem.

3. Beam Search: Selection of the Fittest

Instead of exploring all neighbors, Beam Search only explores the "Top K" neighbors at each level, based on a relevance score (like PageRank or Vector Similarity).

Level 1: Look at 100 neighbors. Pick the top 5.
Level 2: Look at the neighbors of those 5. Pick the top 5 again.
Result: You only explore 25 nodes total, but they are the best 25 nodes across two hops.

graph TD
    S((Start)) --> A1[0.9]
    S --> A2[0.8]
    S --> A3[0.2 - PRUNED]
    
    A1 --> B1[0.95]
    A1 --> B2[0.1 - PRUNED]
    
    A2 --> B3[0.85]
    A2 --> B4[0.4 - PRUNED]
    
    style S fill:#4285F4,color:#fff
    style A1 fill:#34A853,color:#fff
    style A2 fill:#34A853,color:#fff

4. Implementation: A Bounded Traversal in Cypher

Let's look at how to limit both height (hops) and width (neighbors).

MATCH (p:Person {id: 'Sudeep'})
// 1. LIMIT THE DEPTH (Hops)
MATCH p = (p)-[*1..2]-(other)
// 2. FILTER BY EDGE TYPE
WHERE all(r in relationships(p) WHERE type(r) IN ['WORKS_AT', 'LEADS', 'PROJECT_OF'])
// 3. LIMIT THE WIDTH (Neighbors)
WITH other, count(*) as frequency
ORDER BY frequency DESC
LIMIT 10
RETURN other.name

5. Summary and Exercises

Multi-hop scaling is about "Intelligent Ignorance."

Pruning by relationship type is the most effective way to stay fast.
Summarization collapses large communities into single manageable nodes.
Beam Search (Breadth Limiting) ensures you only follow the "Golden Paths."
Bounded Traversal (*1..2) is a mandatory safety rail.

Exercises

Pruning Scenario: You are building a bot for a "Recipe Graph." The user asks: "Is there egg in this cake?" List 3 relationship types you should follow and 3 you should IGNORE.
Breadth Math: If you use a "Beam Width" of 3 and go 3 hops deep, how many total nodes will you visit? Is this better than an unconditioned search that hits 1,000 nodes?
The "Community" Logic: Why would an "Executive Summary" node be better for a high-level question like "What happened in the company this month?" than a multi-hop walk across 1 million Slack messages?

In the next lesson, we will look at the ultimate interface: Dynamic Query Generation with LLMs.