
Project 2: Building a RAG Knowledge Assistant
Build a production-grade RAG system. Learn to bridge the gap between Vector Retrieval and LLM Generation.
Semantic search is the foundation, but RAG (Retrieval-Augmented Generation) is the building. In this project, you will create a "Support Bot" that reads your private knowledge base and answers questions in conversational English.
You will focus on Context Injection and Grounding.
1. Project Requirements
- Base: Use the semantic search engine from Project 1.
- LLM: OpenAI GPT-4o-mini or Gemini-1.5-Flash (a minimal client setup is sketched after this list).
- Goal: The assistant must answer only from the retrieved documents (Zero Hallucination).
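The implementation in Section 3 calls a generic llm.chat(system_prompt, question) helper. That is not a real library method, so here is one possible sketch of it, assuming the official openai Python package and an OPENAI_API_KEY in your environment (swap in the Gemini SDK if you prefer):

from openai import OpenAI

class LLM:
    """Hypothetical helper wrapping a single chat-completion call."""
    def __init__(self, model="gpt-4o-mini"):
        self.model = model
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def chat(self, system_prompt, user_message):
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message},
            ],
        )
        return response.choices[0].message.content

llm = LLM()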
2. The RAG Loop
- Retrieve: Find the top 3 relevant chunks from your Vector DB.
- Pack: Combine these chunks into a single "Context Block."
- Prompt: Send the Context + User Question to the LLM.
- Cite: Ensure the LLM mentions which document it is using.
3. The Implementation (Python)
Assuming collection is the Chroma collection you built in Project 1 and llm is the client wrapper sketched above:

def ask_assistant(question):
    # 1. RETRIEVAL: pull the top 3 chunks from the vector store
    results = collection.query(query_texts=[question], n_results=3)
    context = "\n\n".join(results["documents"][0])
    # Deduplicate file names while preserving order
    sources = ", ".join(dict.fromkeys(m["source"] for m in results["metadatas"][0]))

    # 2. GENERATION: ground the LLM in the retrieved context
    system_prompt = f"""
You are a helpful assistant. Use ONLY the following context to answer.
If the answer is not in the context, say 'I don't know.'

CONTEXT:
{context}
"""
    response = llm.chat(system_prompt, question)
    return f"{response}\n\nSources: {sources}"
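A quick smoke test (the question is a placeholder; use one that matches your own documents):

print(ask_assistant("What is our refund policy?"))
# Expected shape: a grounded answer, followed by a line like "Sources: faq.md, policy.md"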
4. Evaluation Criteria
- Hallucination Check: If you ask a question about something NOT in your files, does the bot correctly say "I don't know"?
- Source Attribution: Does the bot correctly list the files it used?
- Formatting: Is the answer concise and well-structured?
Deliverables
- A Jupyter Notebook or Python file demonstrating the full RAG loop.
- Three test cases: (1) a fact that is in your docs, (2) a fact that is NOT in your docs, (3) an ambiguous query. A minimal harness is sketched below.
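One way to run the three cases (the questions here are placeholders; substitute ones that fit your own knowledge base):

# Placeholder questions; replace with ones matching your documents.
test_cases = {
    "fact_in_doc": "What is our refund policy?",
    "fact_not_in_doc": "Who won the 1998 World Cup?",  # should yield "I don't know."
    "ambiguous": "What about the policy?",
}

for name, question in test_cases.items():
    print(f"--- {name} ---")
    print(ask_assistant(question))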