
Pinecone Index Configuration: Optimizing the Schema for Search
Learn how to tune your Pinecone index settings. Explore metadata indexing configurations, choosing the right distance metric, and the impact of pod types on performance.
Pinecone Index Configuration
When you create an index in Pinecone, you aren't just giving it a name. You are defining the Physical and Logical Constraints of your search system. In a managed environment, many settings are hidden, but the ones you can control are critical for performance and cost.
In this lesson, we will go deep into index configuration. We will learn which distance metric to choose for which AI model, how to use Metadata Config to save on storage, and how to scale your index using the Pinecone API.
1. Choosing the Right Distance Metric
Pinecone supports three metrics: cosine, dotproduct, and euclidean. As we learned in Module 2, the choice depends entirely on your Embedding Model.
- cosine (Recommended for most text models): Use this for OpenAI (text-embedding-3), Cohere, and HuggingFace models. It measures the direction of the vector and is robust to varying document lengths.
- dotproduct (For Maximum Performance): If your model outputs "normalized" vectors (length of 1.0), use dot product. It is mathematically simpler and slightly faster for the Pinecone engine to process.
- euclidean: Use this for non-text vectors like image features (CLIP) or raw scientific data, where the physical distance between points is meaningful.
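The metric is part of the index definition and cannot be changed after creation. A minimal sketch with the Python client (index name and region are placeholders):

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# The metric is fixed once the index exists.
pc.create_index(
    name="text-search",    # placeholder name
    dimension=1536,        # must match your embedding model
    metric="cosine",       # or "dotproduct" / "euclidean"
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)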
2. Dimensionality: The Point of No Return
Every Pinecone index has a fixed dimension.
- OpenAI: 1536
- Cohere: 1024 or 4096
- Llama 3 Embed: 4096
Engineering Warning: If you realize halfway through your project that you want to switch from a 1536D model to a 1024D model, you cannot just "adjust" the settings. You must:
- Delete the index (or create a new one).
- Re-embed every single document in your source database.
- UPSERT millions of vectors again.
Lesson: Finalize your model choice before you start a bulk ingestion.
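One way to avoid a painful surprise is to verify the index dimension before a bulk run. A small sketch using describe_index (the index name is a placeholder):

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Fail fast if the index dimension doesn't match the embedding model.
description = pc.describe_index("my-index")
assert description.dimension == 1536, (
    f"Index expects {description.dimension}-dimensional vectors"
)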
3. Metadata Configuration (Selective Indexing)
By default, Pinecone indexes every metadata field you provide. If you have 10 fields (e.g., author, date, category, raw_text), Pinecone builds 10 separate metadata indexes.
This is convenient but expensive. Each metadata index consumes storage and can slow down your queries.
The Solution: metadata_config
In your index configuration, you can specify exactly which fields you want to be "Filterable."
# Conceptual configuration during creation. Selective metadata indexing
# is a pod-based feature, so this example uses PodSpec, not ServerlessSpec.
from pinecone import PodSpec

pc.create_index(
    name="optimized-index",  # index names must be lowercase, with hyphens
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east-1-aws",
        # Only index these two fields for filtering.
        # Other metadata fields will be STORED but not INDEXED.
        metadata_config={"indexed": ["user_id", "is_published"]},
    ),
)
Rule of Thumb: Don't index large text blobs (like the actual content of a page) in your metadata. Keep those fields unindexed so they are still returned in search results, but don't waste memory trying to "filter" by them.
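At query time, only the indexed fields can appear in a filter; everything else still comes back in the result metadata. A sketch, assuming query_embedding already holds the embedded user query:

# Filter on the indexed fields; unindexed fields are returned in
# metadata but cannot be used inside the filter expression.
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"user_id": {"$eq": "u-123"}, "is_published": {"$eq": True}},
    include_metadata=True,
)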
4. Upsert Limits and Batching
When configuring your ingestion script, you need to understand Pinecone's Upsert Pipeline.
- You cannot send 1 million vectors in a single API call.
- The standard batch size is 100 to 200 vectors per request.
Optimal Ingestion Pattern:
def batch_upsert(index, data, batch_size=100):
    # Slice the payload into API-sized chunks and send them sequentially.
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        index.upsert(vectors=batch)
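A possible call site, where embed() and chunks stand in for your own embedding function and source texts:

# (id, values, metadata) tuples are one of the formats upsert accepts.
vectors = [
    (f"doc-{i}", embed(chunk), {"source_id": str(i)})  # embed() is hypothetical
    for i, chunk in enumerate(chunks)
]
batch_upsert(index, vectors)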
5. Scaling Pods (The Multi-Pod Strategy)
If you are using pod-based indexes, you can scale them on the fly without downtime.
- Scaling Horizontally (More Shards): Used when you run out of space for vectors.
- Scaling Vertically (Higher Pod Type): Used when search latency is too high.
- Scaling Replicas: Used when you have too many simultaneous users (high QPS).
In the Pinecone Dashboard or API, you can increase replicas instantly:
pc.configure_index("my-index", replicas=5)
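The same call covers vertical scaling, for example stepping up to a larger pod size within the same family (values shown are illustrative):

# Vertical scaling: move to a bigger pod size (e.g., p1.x1 -> p1.x2).
pc.configure_index("my-index", pod_type="p1.x2")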
6. The "Source ID" Pattern
Pinecone does not store your "Raw Files." It stores Vectors. A common configuration mistake is trying to cram 20MB of text into the Pinecone metadata, which is capped at roughly 40 KB per vector.
The Best Practice:
Store a source_id (like a database UUID or a S3 URL) in the metadata. When your search returns the source_id, your application fetches the full content from your primary database (SQL/NoSQL). This keeps your Pinecone index lean, fast, and cheap.
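A sketch of the pattern at query time, where fetch_from_primary_db is a hypothetical lookup against your own database:

results = index.query(vector=query_embedding, top_k=5, include_metadata=True)

for match in results.matches:
    doc_id = match.metadata["source_id"]
    # Hypothetical helper: fetch the full document from SQL/NoSQL by ID.
    full_text = fetch_from_primary_db(doc_id)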
Summary and Key Takeaways
Index configuration is the bridge between your code and the cloud hardware.
- Pick the Metric early: usually cosine for text.
- Dimension is Locked: choose your embedding model carefully.
- Selective Metadata Indexing is the best way to save money and improve speed.
- Batch your Upserts: aim for batches of 100 for maximum reliability.
- Reference, don't Duplicate: store IDs to your main DB in metadata, not giant text blobs.
In the next lesson, we will look at Namespaces and Metadata Filtering, the two primary ways to organize data within a single Pinecone index.
Exercise: Schema Review
You are building an "E-mail Search AI." You have 10 million emails with the following metadata fields:
- sender_email
- subject
- email_body (full text)
- received_at (timestamp)
- is_spam (Boolean)

Questions:
- Which distance metric would you choose?
- Which metadata fields would you include in metadata_config["indexed"]?
- Should you store the email_body in the Pinecone metadata? What is the alternative?