Project 2: Building a RAG Knowledge Assistant

Build a production-grade RAG system. Learn to bridge the gap between Vector Retrieval and LLM Generation.

Semantic search is the foundation, but RAG (Retrieval-Augmented Generation) is the building. In this project, you will create a "Support Bot" that reads your private knowledge base and answers questions in conversational English.

You will focus on Context Injection and Grounding.


1. Project Requirements

  • Base: Use the semantic search engine from Project 1.
  • LLM: OpenAI GPT-4o-mini or Gemini-1.5-Flash.
  • Goal: The assistant must only answer based on the retrieved documents (Zero Hallucination).

2. The RAG Loop

  1. Retrieve: Find the top 3 relevant chunks from your Vector DB.
  2. Pack: Combine these chunks into a single "Context Block."
  3. Prompt: Send the Context + User Question to the LLM.
  4. Cite: Ensure the LLM mentions which document it is using (a context-packing sketch follows this list).
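
One way to make the "Cite" step work is to label each chunk with its source file before packing, so the model can reference it by name. A minimal sketch, assuming each retrieved chunk carries a 'source' field in its metadata (as in Project 1):

def pack_context(documents, metadatas):
    # Prefix every chunk with its source so the model can cite it by name.
    blocks = [f"[source: {meta['source']}]\n{doc}" for doc, meta in zip(documents, metadatas)]
    return "\n\n".join(blocks)

If you pack the context this way, you can swap it into the implementation below and instruct the model in the system prompt to quote the [source: ...] tag of whichever block it used.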

3. The Implementation (Python)

import chromadb
from openai import OpenAI

chroma_client = chromadb.PersistentClient(path="./chroma_db")  # vector store built in Project 1 (path is illustrative)
collection = chroma_client.get_collection("knowledge_base")    # pass the same embedding_function used in Project 1 if it wasn't the default
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_assistant(question):
    # 1. RETRIEVAL: fetch the top 3 chunks and their source metadata
    results = collection.query(query_texts=[question], n_results=3)
    context = "\n\n".join(results["documents"][0])
    sources = ", ".join(m["source"] for m in results["metadatas"][0])

    # 2. GENERATION: ground the model in the retrieved context only
    system_prompt = f"""
    You are a helpful assistant. Use ONLY the following context to answer.
    If the answer is not in the context, say 'I don't know.'
    CONTEXT: {context}
    """

    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    answer = response.choices[0].message.content

    # 3. CITE: append the sources of the retrieved chunks
    return f"{answer}\n\nSources: {sources}"
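
A quick smoke test (the question here is only an example; use one your own knowledge base can actually answer):

print(ask_assistant("How do I reset my password?"))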

4. Evaluation Criteria

  • Hallucination Check: If you ask a question about something NOT in your files, does the bot correctly say "I don't know"? (A scripted version of these checks follows this list.)
  • Source Attribution: Does the bot correctly list the files it used?
  • Formatting: Is the answer concise and well-structured?
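
To make these checks repeatable, script them. A minimal sketch, assuming the ask_assistant function above; the questions are placeholders you would swap for ones matching your own documents:

test_cases = [
    ("fact_in_doc", "What is our refund window?"),        # should be answered from the knowledge base
    ("fact_not_in_doc", "Who won the 1998 World Cup?"),   # should produce "I don't know"
    ("ambiguous", "Tell me about the policy."),           # should stay grounded or ask for clarification
]

for name, question in test_cases:
    answer = ask_assistant(question)
    print(f"--- {name} ---\n{answer}\n")
    if name == "fact_not_in_doc" and "i don't know" not in answer.lower():
        print("WARNING: possible hallucination - the answer is not grounded in the documents")

These three cases double as the test cases required in the Deliverables below.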

Deliverables

  1. A Jupyter Notebook or Python file demonstrating the full RAG loop.
  2. 3 Test cases: (1) Fact in doc, (2) Fact NOT in doc, (3) Ambiguous query.

Good luck with your assistant! You are now building the most in-demand AI architecture in the industry.
