Module 10 Wrap-up: Building Your Local Q&A System

Hands-on: The complete RAG project. Index a folder of text files and build a bot that can answer questions about them.

Module 10 Wrap-up: Your Private Library Bot

You have learned the theory of vectors, the mechanics of chunking, and the architecture of pipelines. Now it's time to put them together into a real-world RAG system: a Python script that indexes a "knowledge" folder and answers questions about its contents.


Hands-on Exercise: The Knowledge Bot

1. Requirements

Ensure you have the necessary libraries: pip install ollama chromadb. You will also need both models available locally (ollama pull mxbai-embed-large and ollama pull llama3) and the Ollama server running in the background.

2. The Project Setup

Create a folder named knowledge and put a few .txt files inside it (e.g., your favorite recipes, a summary of a book, or your resume).
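
The script in step 3 simulates this folder with a hard-coded list so the lesson stays self-contained, but reading the real files takes only a few lines. Here is a minimal sketch (the load_documents name is ours, and it assumes knowledge/ sits next to the script):

from pathlib import Path

def load_documents(folder: str = "knowledge") -> list[str]:
    # Read every .txt file in the folder and return its text.
    docs = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:
            docs.append(text)
    return docs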

3. The Code

Create a file named rag_bot.py:

import ollama
import chromadb

# 1. Initialize the in-memory vector store and create a collection
client = chromadb.Client()
collection = client.create_collection(name="docs")

# 2. Ingest your data (simulated here; swap in text loaded from your
#    knowledge folder once the pipeline works)
docs = [
    "The secret ingredient in the sauce is honey and ginger.",
    "The office is closed on the first Monday of every month.",
    "The project deadline has been moved to December 15th."
]

# Embed each document and store the vector alongside its text
for i, d in enumerate(docs):
    response = ollama.embeddings(model="mxbai-embed-large", prompt=d)
    collection.add(
        ids=[str(i)],
        embeddings=[response["embedding"]],
        documents=[d]
    )

# 3. Retrieve: embed the question and find the closest document
query = "What is in the secret sauce?"
query_embed = ollama.embeddings(model="mxbai-embed-large", prompt=query)["embedding"]

results = collection.query(query_embeddings=[query_embed], n_results=1)
context = results["documents"][0][0]

# 4. Augment and Generate: answer using only the retrieved context
output = ollama.generate(
    model="llama3",
    prompt=f"Answer the question based ONLY on this context: {context}\n\nQuestion: {query}"
)

print(f"Question: {query}")
print(f"Answer: {output['response']}")

Module 10 Summary

  • RAG gives LLMs access to specific, private, or current data.
  • Embeddings turn text into a "Searchable Map."
  • Vector Databases like ChromaDB store these maps locally.
  • Chunking ensures the AI can find specific facts inside long documents (see the sketch below).
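
As a refresher, a fixed-size chunker with overlap takes only a few lines. A minimal sketch (the 500-character chunks and 50-character overlap are illustrative values, not tuned ones):

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # Slide a window across the text; the overlap ensures a fact that
    # straddles a boundary still appears whole in at least one chunk.
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]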

Coming Up Next...

In Module 11, we move to the final frontier of customization: Fine-Tuning. We will learn when you should stop using RAG and start training the model's weights directly using Adapters and LoRA.


Module 10 Checklist

  • I can explain what a "Vector" is to a non-technical person.
  • I have downloaded the mxbai-embed-large model.
  • I understand why "Overlap" is important for chunking.
  • I have successfully run a "Similarity Search" in a Vector Store.
  • I can describe the 3 steps of a RAG pipeline (Retrieve, Augment, Generate).
