The Graph RAG Workflow Overview: From Raw to RAG

See the big picture. Learn the end-to-end lifecycle of a Graph RAG request—from the moment a user asks a question to the final synthesis of the graph-augmented answer.

We have defined what Graph RAG is and how it compares to other systems. Now, it's time to see the Engine in Motion. To build one of these systems, you need to understand the Two Lifecycles: the Ingestion Lifecycle (where we build the graph) and the Retrieval Lifecycle (where we use the graph).

In this final lesson of Module 3, we will walk through the assembly line. We will see how a messy PDF is turned into a structured triplet and how a user's vague question is transformed into a high-precision Cypher query. This "Bird's Eye View" will be your roadmap for the rest of the course as we dive into the code for each step.


1. The Ingestion Workflow (The Construction)

The goal of ingestion is to turn "Words" into "Topology."

  1. Parsing: Extracting text from PDFs, HTML, or databases.
  2. Extraction: Using an LLM to identify Entities (Nouns) and Relationships (Verbs).
  3. Entity Resolution: Checking if "Sudeep" in Doc A is the same person as "S. Devkota" in Doc B (Module 11).
  4. Creation: Writing the nodes and edges to Neo4j/Neptune.
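
The four ingestion steps can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the `VERBS` rule table stands in for the LLM extraction step, the `ALIASES` table (including the canonical name "Sudeep Devkota") stands in for a real entity resolution step, and an in-memory `Graph` stands in for Neo4j/Neptune. All of these names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, RELATION, object)

# Hypothetical rule table standing in for LLM-based extraction.
VERBS = {" works at ": "WORKS_AT", " wrote ": "WROTE"}
# Hypothetical alias table standing in for real entity resolution.
ALIASES = {"sudeep": "Sudeep Devkota", "s. devkota": "Sudeep Devkota"}


@dataclass
class Graph:
    """In-memory stand-in for a graph database."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)


def parse(raw: str) -> List[str]:
    """Step 1: split raw text into one statement per line."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def extract(sentence: str) -> Optional[Triple]:
    """Step 2: find an entity-relationship-entity triple, if any."""
    for phrase, relation in VERBS.items():
        if phrase in sentence:
            subject, obj = sentence.split(phrase, 1)
            return subject.strip(), relation, obj.strip()
    return None


def resolve(name: str) -> str:
    """Step 3: map known aliases to one canonical entity name."""
    return ALIASES.get(name.lower(), name)


def create(graph: Graph, triple: Triple) -> None:
    """Step 4: write the nodes and the edge to the graph."""
    subject, relation, obj = triple
    graph.nodes.update({subject, obj})
    graph.edges.add((subject, relation, obj))


def ingest(raw: str) -> Graph:
    graph = Graph()
    for sentence in parse(raw):
        triple = extract(sentence)
        if triple is not None:
            subject, relation, obj = triple
            create(graph, (resolve(subject), relation, resolve(obj)))
    return graph


graph = ingest("Sudeep works at Acme\nS. Devkota wrote GraphBook\n")
print(sorted(graph.nodes))
```

Because resolution runs before creation, both source lines attach to a single person node instead of producing two disconnected ones.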

2. The Retrieval Workflow (The Intelligence)

The goal of retrieval is to find the "Evidence Path."

  1. Intent Classification: Identifying what the user wants (Neighborhood? Path? Global?).
  2. Query Generation: Turning the intent into a Cypher Query (Module 8).
  3. Expansion: Following the relationships from the "Seed" node to build a context subgraph.
  4. Synthesis: Feeding the subgraph to the LLM to generate the final response.
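
The retrieval steps above can be sketched the same way. Everything here is an assumption made for illustration: the keyword rules in `classify_intent` stand in for a real classifier, the Cypher strings are example templates rather than the queries a given system would generate, the toy `EDGES` list stands in for the database, and `synthesize` only builds the prompt that would be sent to an LLM.

```python
from collections import deque

# Toy graph as (subject, RELATION, object) triples.
EDGES = [
    ("Inception", "DIRECTED_BY", "Christopher Nolan"),
    ("Inception", "HAS_GENRE", "Sci-Fi"),
    ("Interstellar", "DIRECTED_BY", "Christopher Nolan"),
]


def classify_intent(question: str) -> str:
    """Step 1: hypothetical keyword rules; a real system might use an LLM."""
    q = question.lower()
    if "connected" in q or "path" in q:
        return "path"
    if "overall" in q or "summarize" in q:
        return "global"
    return "neighborhood"


def generate_cypher(intent: str) -> str:
    """Step 2: pick an illustrative, parameterized Cypher template."""
    templates = {
        "neighborhood": "MATCH (n {name: $seed})-[r]-(m) RETURN n, r, m",
        "path": ("MATCH p = shortestPath((a {name: $seed})-[*..4]-"
                 "(b {name: $target})) RETURN p"),
        "global": "MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 100",
    }
    return templates[intent]


def expand(seed: str, hops: int = 1) -> list:
    """Step 3: breadth-first walk from the seed, collecting triples."""
    seen, frontier, context = {seed}, deque([(seed, 0)]), []
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for s, r, o in EDGES:
            if node in (s, o):
                if (s, r, o) not in context:
                    context.append((s, r, o))
                other = o if node == s else s
                if other not in seen:
                    seen.add(other)
                    frontier.append((other, depth + 1))
    return context


def synthesize(question: str, triples: list) -> str:
    """Step 4: build the prompt; the LLM call itself is omitted."""
    facts = "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in triples)
    return f"Facts:\n{facts}\n\nQuestion: {question}"


question = "Who directed Inception?"
intent = classify_intent(question)
prompt = synthesize(question, expand("Inception", hops=1))
print(generate_cypher(intent))
print(prompt)
```

Raising `hops` widens the context subgraph: with `hops=2` the walk also reaches the other film connected through the director node.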

3. The "Hybrid" Bridge

In reality, these two workflows are often linked by a Vector Search.

  • The user's question is "Vectorized."
  • We find the most similar Node in the graph.
  • We then "Switch" to the Graph Retrieval workflow to follow the logical connections.

```mermaid
graph TD
    subgraph "Ingestion (Offline)"
    D[PDF Docs] --> E[Entities/Rels]
    E --> KG[(Knowledge Graph)]
    end

    subgraph "Retrieval (Online)"
    Q[User Question] --> V[Vector Entry]
    V --> KG
    KG -->|Traversal| P[Context Path]
    P --> LLM[LLM Synth]
    LLM --> A[Answer]
    end

    style KG fill:#4285F4,color:#fff
    style LLM fill:#34A853,color:#fff
```
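
The vector entry followed by the switch to traversal can be sketched as below. This is a toy under loud assumptions: `embed` is a bag-of-words count over a four-word hypothetical vocabulary, standing in for a real embedding model, and the `EDGES` list stands in for the knowledge graph.

```python
import math

# Hypothetical vocabulary; a real system would use a learned embedding model.
VOCAB = ["japan", "france", "tokyo", "paris"]
EDGES = [("Japan", "HAS_CAPITAL", "Tokyo"), ("France", "HAS_CAPITAL", "Paris")]


def embed(text: str) -> list:
    """Toy embedding: count how often each vocabulary word appears."""
    words = text.lower().replace("?", " ").split()
    return [float(words.count(w)) for w in VOCAB]


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Node vectors would normally be computed once, during ingestion.
NODE_VECTORS = {name: embed(name) for name in ["Japan", "France", "Tokyo", "Paris"]}


def vector_entry(question: str) -> str:
    """Vectorize the question and return the most similar node (the seed)."""
    q = embed(question)
    return max(NODE_VECTORS, key=lambda name: cosine(q, NODE_VECTORS[name]))


def traverse(seed: str) -> list:
    """Switch to the graph: follow the relationships touching the seed."""
    return [(s, r, o) for s, r, o in EDGES if seed in (s, o)]


seed = vector_entry("What is the capital of Japan?")
print(seed, traverse(seed))
```

The similarity search only picks the entry node; the answer itself comes from the relationship that the traversal follows out of that node.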

4. Summary and Exercises

The Graph RAG workflow is a "Pipeline of Precision."

  • Ingestion is a one-time (or periodic) process to build the world model.
  • Retrieval is a real-time process to navigate that model.
  • Vector Search is the "Entrance Door" to the graph.
  • Synthesis is the final "Translation" of the graph facts back into human language.

Exercises

  1. Workflow Mapping: If you wanted to build a "Movie Recommendation Bot," which part of the workflow would handle "Finding the genre"? Which part would handle "Naming the Director"?
  2. The "Resolution" Check: Why must Entity Resolution happen during ingestion rather than during retrieval? (Hint: Think about duplicate nodes.)
  3. Visualization: Draw a flow chart for a simple Graph RAG system that answers the question: "What is the capital of Japan?"

Congratulations! You have completed Module 3: "What Is Graph RAG?" You now have a complete conceptual map of the entire system architecture.

In Module 4: Graph Fundamentals for AI Engineers, we will pick up our tools and start learning the "Grammar" of graphs: Nodes, Edges, and Paths.
