
Building Custom Retrieval Chains: The Hybrid Logic
Take full control of your retrieval logic. Learn how to build custom LangChain LCEL chains that combine vector searches, graph traversals, and Python-based filtering in a single, high-performance pipeline.
The generic `GraphCypherQAChain` (Lesson 2) is a black box. To build a production-ready Graph RAG system, you need to open that box. You need to be able to:
- Intercept the query.
- Combine it with a Vector search.
- Filter the results with Python before they hit the LLM.
In this lesson, we will move away from pre-built chains and into LCEL (LangChain Expression Language). We will build a modular pipeline where we explicitly define the "Retrieval Step" and the "Synthesis Step." We will see how this modularity allows us to build the Hybrid Pattern (Vector Search -> Graph Walk) we designed in Module 8.
1. Defining the "Retriever" Component
In LCEL, we want to create a function that takes a "Query String" and returns "Text Evidence."
```python
def custom_graph_retriever(query_str: str) -> str:
    # 1. Vector search to find "seed" nodes related to the question
    seed_nodes = vector_db.search(query_str)

    # 2. Expand each seed node one hop out in the graph.
    #    Passing the node ID as a query parameter ($id) keeps
    #    user-derived values out of the Cypher string itself.
    facts = []
    for node in seed_nodes:
        graph_data = graph.query(
            "MATCH (n {id: $id})-[r]-(m) RETURN labels(m), m.name",
            params={"id": node.id},
        )
        facts.append(str(graph_data))
    return "\n".join(facts)
```
2. Assembling the LCEL Chain
LCEL uses the pipe (`|`) operator to link steps. It is like a Unix pipeline for AI.
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# A. The Prompt
template = """Answer the question based ONLY on this graph evidence:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# B. The Chain: the dict runs the retriever and forwards
#    the raw question in parallel, then fills the prompt.
chain = (
    {"context": custom_graph_retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()  # unwrap the model message into a plain string
)

# C. Execution
chain.invoke("Who are the developers on the Valkyrie project?")
```
3. Benefits of the Custom Approach
- Security (Injection Prevention): You can sanitize the user's input in Python before it ever touches your Cypher generator.
- Cost Control: You can count tokens at each step and trim the graph results if they are too long.
- Observability: You can easily log exactly what the "Graph Retriever" found, making it much easier to debug why a specific answer was wrong.
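To make the injection-prevention point concrete, here is a minimal, hypothetical helper (the name `sanitize_identifier` and its allow-list are illustrative assumptions, not part of LangChain). It rejects any value containing characters that have meaning in Cypher before that value reaches a query builder. Parameterized queries remain the primary defense; this is a defensive second layer.

```python
import re

def sanitize_identifier(user_value: str) -> str:
    """Reject values containing characters that could alter a Cypher query.

    Allow-list approach: letters, digits, underscores, spaces, dots,
    and hyphens only. Anything else raises instead of being escaped.
    """
    if not re.fullmatch(r"[\w .\-]+", user_value):
        raise ValueError(f"Rejected suspicious input: {user_value!r}")
    return user_value

sanitize_identifier("Valkyrie")  # passes through unchanged
```

Because this runs in plain Python before the retrieval step, a rejected input never reaches the graph or the LLM at all.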
```mermaid
graph LR
    U[User Question] --> P[Python Pre-processor]
    P --> V[Vector Entry]
    V --> G[Graph Traversal]
    G --> F[Fact Formatter]
    F --> L[Final LLM Prompt]
    L --> A[Answer]
    style V fill:#f4b400,color:#fff
    style G fill:#4285F4,color:#fff
    style L fill:#34A853,color:#fff
```
4. Implementation: The "Self-Correcting" Retriever
We can add a "Retry" step inside our custom chain. If the first graph query returns 0 results, our Python function can automatically try a "Broader" query (e.g., searching for the Department instead of the Project).
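A minimal sketch of that retry logic, kept as plain Python so the control flow is clear. `run_query` and `broaden` are hypothetical callables you would supply yourself (e.g., one wrapping `graph.query`, the other a rewrite rule such as Project -> Department):

```python
def self_correcting_retriever(question, run_query, broaden):
    """Try the narrow query first; on zero results, retry a broader one.

    run_query: callable(str) -> list of result rows
    broaden:   callable(str) -> str, a looser version of the question
    """
    results = run_query(question)
    if results:
        return results
    # First attempt came back empty: widen the search and try once more
    return run_query(broaden(question))
```

Because the fallback lives in ordinary Python, you can log each attempt, cap the number of retries, or plug the whole function into an LCEL chain as the `context` step.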
5. Summary and Exercises
Custom chains transition your code from "Magic" to Control.
- LCEL allows for modular, testable retrieval steps.
- Separation of Concerns: Keep your "Graph Logic" separate from your "Answer Logic."
- Flexibility: You can now combine Graph Database results with other sources (APIs, SQL, Local Files).
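To illustrate the flexibility point, here is a plain-Python sketch of merging several evidence sources into one context string. In a real chain you might express the fan-out with `RunnableParallel`; the `retrievers` dict and the source labels here are illustrative assumptions:

```python
def combined_context(question, retrievers):
    """Run each named retriever and label its evidence in one string.

    retrievers: dict mapping a source name (e.g. "graph", "sql")
    to a callable(str) -> str that returns formatted evidence.
    """
    parts = [f"[{name}]\n{fetch(question)}" for name, fetch in retrievers.items()]
    return "\n\n".join(parts)
```

The labeled sections let the synthesis prompt tell the LLM where each piece of evidence came from.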
Exercises
- Pipeline Draft: Write a Python function that takes a user's question and returns the "Top 3 Nodes" as a comma-separated string. Connect this as the first step in an LCEL chain.
- Debug Task: If your chain returns an "Empty Context" error, where in your LCEL pipeline should you add a `print()` statement to see what's happening?
- Hybrid Logic: How would you modify the `custom_graph_retriever` above to first check a cache (like Redis) before querying the Graph Database?
In the next lesson, we will look at how to give our AI a "Long-Term Graph Memory": Implementing Graph Memory in Agents.