Module 6 Lesson 3: Introduction to Vector Stores
The Semantic Database. How to store thousands of vectors so you can search them in milliseconds.
Vector Stores: The Brain's Warehouse
A vector store is a specialized database that stores your text chunks together with their corresponding embedding vectors. Unlike a conventional database, where you typically search for exact words, a vector store searches for similar meanings.
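To make "searching by meaning" concrete, here is a minimal sketch of what a vector store does at its core: keep each text next to its vector, then rank by similarity to the query vector. The 3-dimensional vectors and the `search` helper are made up for illustration; a real embedding model produces the vectors, and they have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Each entry pairs a text chunk with its (hypothetical, hand-written) embedding.
store = [
    ("The sun is hot.", [0.9, 0.1, 0.0]),
    ("The ice is cold.", [0.1, 0.9, 0.0]),
]

def search(query_vector, k=1):
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Assume an embedding model mapped "Which one is freezing?" to this vector.
query_vector = [0.2, 0.8, 0.1]
top_match = search(query_vector)[0]
print(top_match)  # The ice is cold.
```

Note that the query shares no words with the matching text; the match comes entirely from the vectors pointing in a similar direction.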
1. How it Works (The Index)
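When you add documents to a vector store, it builds an index. An index is a data structure that lets the database avoid comparing your query against every stored vector: it narrows the search to a small set of likely candidates and returns the pieces of text whose vectors are closest to the query. Many indexes are approximate, trading a little accuracy for a large speedup.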
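One way to picture such a shortcut is a bucketed index: group vectors around a few centroids at build time, then scan only the bucket nearest to the query. The sketch below is a toy version of that idea, with hand-picked centroids and made-up 2-dimensional vectors; production indexes (for example the IVF and HNSW families in FAISS) are considerably more sophisticated.

```python
import math

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-d vectors, keyed by document id.
vectors = {
    "doc_a": [0.1, 0.2],
    "doc_b": [0.2, 0.1],
    "doc_c": [0.9, 0.8],
    "doc_d": [0.8, 0.9],
}

# Hand-picked centroids (assumption); real indexes learn these from the data.
centroids = [[0.0, 0.0], [1.0, 1.0]]

def nearest_centroid(vector):
    """Return the position of the centroid closest to the given vector."""
    return min(range(len(centroids)), key=lambda i: distance(vector, centroids[i]))

# Build step: assign every vector to the bucket of its nearest centroid.
buckets = {i: [] for i in range(len(centroids))}
for doc_id, vec in vectors.items():
    buckets[nearest_centroid(vec)].append(doc_id)

def indexed_search(query):
    """Scan only the query's bucket; return the best id and how many vectors were compared."""
    candidates = buckets[nearest_centroid(query)]
    best = min(candidates, key=lambda doc_id: distance(query, vectors[doc_id]))
    return best, len(candidates)

best_id, compared = indexed_search([0.85, 0.95])
print(best_id, compared)  # doc_d 2
```

Only two of the four vectors were compared. The cost of this shortcut is that a true nearest neighbor sitting just across a bucket boundary can be missed, which is why such indexes are called approximate.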
2. Top Vector Stores for Developers
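- ChromaDB: A popular open-source choice for local Python development. Fast and easy to set up.
- FAISS: A high-performance similarity-search library from Meta (formerly Facebook) AI Research. Great for raw speed, but it is a library rather than a full database, so it can be harder to manage.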
- Pinecone: The "Cloud" choice. Managed, scalable, and powerful for production.
3. Basic Code Example (using FAISS)
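Install the libraries: pip install faiss-cpu langchain-community langchain-openai
The example also expects your OpenAI API key in the OPENAI_API_KEY environment variable.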
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
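# 1. Setup: choose the embedding model and the texts to store
embeddings = OpenAIEmbeddings()
texts = ["The sun is hot.", "The ice is cold."]
# 2. Create and store: embed each text and build an in-memory FAISS index
db = FAISS.from_texts(texts, embeddings)
# 3. Search: embed the query and return the single closest text (k=1)
results = db.similarity_search("Which one is freezing?", k=1)
print(results[0].page_content)
# Expected output: The ice is cold.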
4. Visualizing Search Logic
graph TD
User[Query: 'Freezing'] --> E[Embedding Model]
E --> Vec[Query Vector]
Vec --> VS[Vector Store Index]
VS --> Match[Result: 'Cold']
Match --> Final[Return Text to User]
5. Persistence: Saving the DB
Local vector stores like FAISS or Chroma can be saved to your hard drive so you don't have to re-embed everything every time you restart your script.
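# Save the index and its documents to a local folder
db.save_local("faiss_index")
# Load back later. Recent langchain_community versions require this flag because
# the stored documents are unpickled on load; only load files you created or trust.
new_db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)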
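The idea behind persistence can be sketched without any vector library: write the texts and their already-computed vectors to disk once, then read them back instead of calling the embedding model again. The JSON layout, the file name, and the `save_store` / `load_store` helpers below are hypothetical; LangChain's FAISS wrapper uses its own on-disk format (a binary index file plus a pickled document store).

```python
import json
import os
import tempfile

# Hypothetical pre-computed embeddings; in practice these come from an embedding model.
records = [
    {"text": "The sun is hot.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The ice is cold.", "vector": [0.1, 0.9, 0.0]},
]

def save_store(records, path):
    """Write texts and vectors to disk so they never need re-embedding."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)

def load_store(path):
    """Read texts and vectors back; no embedding model call is needed."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# A temporary directory keeps the sketch self-cleaning; a real script would use a fixed path.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_index.json")
    save_store(records, path)
    restored = load_store(path)

print(restored == records)  # True
```

Because embedding calls usually cost time and money, saving the vectors once and reloading them is typically far cheaper than rebuilding the store on every run.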
Key Takeaways
- Vector Stores enable high-speed semantic search.
- The index is the data structure that makes search fast.
- Chroma and FAISS are convenient choices for starting out locally.
- Persistence lets you carry your AI's memory across sessions.