Index Versioning: Managing Change

Learn how to manage multiple versions of your vector index. Master the Blue-Green deployment strategy for vector data.

In a traditional database, you can usually add a column while the system is running. In a Vector Database, if you change your Chunking Strategy or your Embedding Model, you cannot just "Update" the vectors in place. You have to re-embed every document and rebuild the entire index from scratch. Think of it as the vector equivalent of a Schema Migration.

In this lesson, we learn how to manage multiple versions of our index without downtime.


1. The "Vector Version" Conflict

Vectors are tied to the model that created them.

  • Vector_A (text-embedding-ada-002) is a 1536-dim point.
  • Vector_B (text-embedding-3-small) is also 1536-dim by default, but the coordinates mean something entirely different.

If you search an ada-002 index using a 3-small query vector, the database will not raise an error, because the dimensions match. It will simply return results that are effectively random. You must version your indexes.
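
One way to catch this mistake early is to store the model name next to the index and check it before every search. The sketch below assumes a hypothetical IndexMeta record and a hypothetical check_query_compatible helper. Neither comes from a real vector database client, so adapt them to wherever you keep index metadata.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class IndexMeta:
    """Metadata kept alongside an index (hypothetical structure)."""
    name: str
    embedding_model: str
    dimensions: int


def check_query_compatible(
    meta: IndexMeta, query_model: str, query_vector: Sequence[float]
) -> None:
    """Raise ValueError if the query vector does not belong to this index."""
    # The model check comes first: two models can share a dimension count.
    if query_model != meta.embedding_model:
        raise ValueError(
            f"index {meta.name!r} was built with {meta.embedding_model!r}, "
            f"but the query was embedded with {query_model!r}"
        )
    if len(query_vector) != meta.dimensions:
        raise ValueError(
            f"expected {meta.dimensions} dimensions, got {len(query_vector)}"
        )
```

Note that a dimension check alone would not be enough here, since both models in the example above produce 1536-dim vectors.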


2. Blue-Green Index Deployment

This is a widely used pattern for safe vector migrations:

  1. Blue Index (v1): The current production index serving users.
  2. Green Index (v2): You create a brand new index. You re-ingest all documents using the new model/chunking rules.
  3. Validation: You run your test suite (Module 11) against the Green Index.
  4. The Flip: You point your backend code (the API) to the Green Index.
  5. Teardown: You delete the Blue Index once you are sure Green is stable.

Diagram: the Backend API currently routes to the Blue index (docs-2023-v1). After the switch, it routes to the Green index (docs-2024-v2), which has been tested and verified.
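
The five steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real client: IndexRouter and blue_green_migrate are hypothetical names, and build_index and validate are placeholders for your own ingestion job and test suite.

```python
from typing import Callable, Optional


class IndexRouter:
    """Tracks which index serves traffic (in-memory stand-in for real config)."""

    def __init__(self, live: str) -> None:
        self.live = live
        self.previous: Optional[str] = None

    def flip(self, new_index: str) -> None:
        # Remember the old index so a rollback stays possible.
        self.previous, self.live = self.live, new_index

    def rollback(self) -> None:
        if self.previous is None:
            raise RuntimeError("no previous index to roll back to")
        self.live, self.previous = self.previous, self.live


def blue_green_migrate(
    router: IndexRouter,
    green: str,
    build_index: Callable[[str], None],
    validate: Callable[[str], bool],
) -> bool:
    """Build and validate the green index; flip only if validation passes."""
    build_index(green)        # Step 2: re-ingest everything into the new index.
    if not validate(green):   # Step 3: run the test suite against Green.
        return False          # Blue keeps serving; nothing has changed.
    router.flip(green)        # Step 4: the flip.
    return True               # Step 5 (teardown) is left to the caller.
```

Teardown is deliberately kept out of the function, so that deleting Blue remains a separate, conscious decision.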

3. Implementation: Naming Conventions (Python)

Never call your index "production." Use a name that includes the model and a version number.

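import os

# GOOD: the name records the provider, the model, a date, and a revision.
INDEX_NAME = "kb-openai-v3-small-2024-05-v1"

# In your FastAPI config: read the live index name from the environment,
# falling back to the default above when the variable is not set.
CURRENT_INDEX = os.getenv("VECTOR_INDEX_VERSION", INDEX_NAME)

# To upgrade, you just change one environment variable!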
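
If you want to enforce the convention rather than rely on memory, a small helper can build and parse names. The sketch below assumes a corpus-model-date-revision layout inferred from the example name above. The function names and the regular expression are illustrative, not part of any library.

```python
import re

# Assumed layout: <corpus>-<model>-<YYYY-MM>-v<revision>
_NAME_RE = re.compile(
    r"^(?P<corpus>[a-z0-9]+)-(?P<model>[a-z0-9-]+)-(?P<date>\d{4}-\d{2})-v(?P<rev>\d+)$"
)


def build_index_name(corpus: str, model: str, date: str, rev: int) -> str:
    """Assemble an index name from its parts."""
    return f"{corpus}-{model}-{date}-v{rev}"


def parse_index_name(name: str) -> dict:
    """Split an index name into its parts, or raise ValueError."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"index name {name!r} does not follow the convention")
    parts = match.groupdict()
    parts["rev"] = int(parts["rev"])
    return parts
```

Calling parse_index_name on a name like "production" raises an error, which makes a handy check in a deployment script.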

4. Aliases: The Professional Layer

Some search engines and vector databases (OpenSearch, for example) support Aliases. You point your code to a fixed name like current_docs, and inside the database, you point that alias to whichever underlying index version should be live. This allows for instant "Flips" with zero code changes.
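
In OpenSearch, an alias is moved by sending a list of remove and add actions to the _aliases endpoint. The sketch below only builds that request body and sends nothing. The helper name alias_swap_actions is hypothetical, and the index names are borrowed from the diagram earlier in this lesson.

```python
import json


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build a request body for POST /_aliases that repoints an alias."""
    return {
        "actions": [
            # Detach the alias from the old (Blue) index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and attach it to the new (Green) index in the same request.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = alias_swap_actions("current_docs", "docs-2023-v1", "docs-2024-v2")
print(json.dumps(body, indent=2))
```

Putting both actions in one request is what keeps the flip clean: there should be no moment where current_docs points at nothing. Check your database's documentation for the exact guarantees it offers.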


5. Summary and Key Takeaways

  1. Models are Locked: You can't mix models in a single index.
  2. Blue-Green is Safest: Always build a new index for a major change.
  3. Audit your Naming: Include the model, a date, and a version number in your index names.
  4. Rollback Priority: Keep the old "Blue" index alive for at least 24 hours after the flip in case of errors.
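
The rollback window in point 4 is easy to encode as a guard in a cleanup script. This is a minimal sketch: the function name and the idea of recording a flip timestamp are assumptions, and 24 hours is simply the minimum suggested above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_ROLLBACK_WINDOW = timedelta(hours=24)


def safe_to_delete_blue(
    flipped_at: datetime,
    now: Optional[datetime] = None,
    window: timedelta = MIN_ROLLBACK_WINDOW,
) -> bool:
    """Return True once the old index has outlived the rollback window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - flipped_at >= window
```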

In the next lesson, we’ll look at the deployment mechanic: Rolling Updates.


Congratulations on completing Module 17 Lesson 3! You can now move data safely.
