
Cost Models: Pricing the Search
Master the economics of vector databases. Compare Pinecone's serverless pricing with the infra costs of self-hosting.
Vector storage is generally more expensive than text storage. Why? Because the HNSW index typically has to live in RAM (Random Access Memory) to answer queries quickly, and RAM costs significantly more per gigabyte than disk space.
In this lesson, we break down the pricing models for the three databases we've studied.
1. Chroma Cost: "The Hardware Burden"
Chroma is free as in "Free Software." However, it is not free to run.
- The Math: If you self-host on a $20/month VPS, you are limited by that VPS's RAM (e.g., 4GB).
- The Limit: Once your index reaches ~500,000 vectors, it outgrows that RAM and you'll need to upgrade to a larger server at around $100/month.
- Hidden Cost: You pay for the time spent managing Docker, backups, and security.
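To see why a 4GB server tops out somewhere near 500,000 vectors, here is a rough back-of-the-envelope sketch. The 1,536-dimension float32 embeddings, the `m=16` link count, and the helper name `estimate_hnsw_ram_gb` are assumptions chosen for illustration, not figures taken from Chroma itself.

```python
def estimate_hnsw_ram_gb(num_vectors: int, dims: int = 1536, m: int = 16) -> float:
    """Rough RAM estimate (in GiB) for an HNSW index over float32 vectors."""
    bytes_per_float = 4
    # Raw vector data: one float32 per dimension.
    vector_bytes = num_vectors * dims * bytes_per_float
    # Assumption: roughly 2*m neighbour links per vector, 4 bytes per link.
    graph_bytes = num_vectors * 2 * m * 4
    return (vector_bytes + graph_bytes) / 1024**3


ram_gb = estimate_hnsw_ram_gb(500_000)
print(f"~{ram_gb:.1f} GiB for 500,000 vectors")
```

Under these assumptions the index alone needs close to 3 GiB, which leaves little headroom on a 4GB machine once the operating system and the database process are counted. Smaller embeddings would push the limit higher.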
2. Pinecone Cost: "Pay-as-you-Embed"
Pinecone uses a Serverless model (as of 2024).
- The Math:
  - $X per GB of storage.
  - $Y per million read/write requests.
- The Limit: It is extremely cheap for small apps (often $0 on the free tier). Costs stay fairly predictable as you grow, but a sudden "Query Storm" can still spike your bill.
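The usage-based formula above can be sketched in a few lines. The rates below are hypothetical placeholders standing in for $X and $Y, not Pinecone's actual prices, and `serverless_monthly_cost` is an illustrative helper rather than part of any SDK.

```python
def serverless_monthly_cost(
    storage_gb: float,
    million_requests: float,
    price_per_gb: float,
    price_per_million_requests: float,
) -> float:
    """Monthly bill = storage fee + request fee."""
    return storage_gb * price_per_gb + million_requests * price_per_million_requests


# Hypothetical rates, for illustration only.
PRICE_PER_GB = 0.30
PRICE_PER_MILLION_REQUESTS = 10.0

normal_month = serverless_monthly_cost(5, 2, PRICE_PER_GB, PRICE_PER_MILLION_REQUESTS)
storm_month = serverless_monthly_cost(5, 50, PRICE_PER_GB, PRICE_PER_MILLION_REQUESTS)
print(f"normal month: ${normal_month:.2f}")
print(f"query storm:  ${storm_month:.2f}")
```

Storage is identical in both months; only the request count changes. That is the shape of a "Query Storm": the stored data stays flat while the request term dominates the bill.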
3. OpenSearch Cost: "Enterprise Overhead"
OpenSearch is usually deployed as a managed cluster on AWS (Amazon OpenSearch Service).
- The Math: You pay for "Instances" (e.g., three `r6g.large` nodes).
- The Limit: The minimum cost is often ~$150-$300/month just to keep a basic high-availability cluster running, even if it has zero data and zero traffic.
- Hidden Benefit: As you grow to 100M+ vectors, OpenSearch can actually become cheaper than Pinecone because you aren't paying "per-request" fees.
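The trade-off between a fixed cluster fee and per-request fees comes down to a break-even point. The sketch below assumes a $300/month cluster (the upper end of the range quoted above) and hypothetical serverless rates; `breakeven_million_requests` is an illustrative name, and none of the rates are real vendor prices.

```python
def breakeven_million_requests(
    cluster_monthly_cost: float,
    storage_gb: float,
    price_per_gb: float,
    price_per_million_requests: float,
) -> float:
    """Requests per month (in millions) above which the fixed cluster is cheaper."""
    # Budget left for requests after the serverless storage fee is paid.
    remaining = cluster_monthly_cost - storage_gb * price_per_gb
    return max(remaining, 0.0) / price_per_million_requests


# Hypothetical serverless rates, for illustration only.
breakeven = breakeven_million_requests(
    cluster_monthly_cost=300.0,
    storage_gb=5.0,
    price_per_gb=0.30,
    price_per_million_requests=10.0,
)
print(f"break-even at ~{breakeven:.1f}M requests/month")
```

Below the break-even volume the usage-based bill is lower; above it, the flat instance cost wins. This simple model ignores that a larger dataset would eventually need more or bigger instances.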
4. Comparison Table: Cost at Scale
| Volume | Chroma (Self-host) | Pinecone (Serverless) | OpenSearch (AWS) |
|---|---|---|---|
| 10k Vectors | $5/mo | $0/mo (Free tier) | $150/mo (Min cluster) |
| 1M Vectors | $40/mo | $30/mo | $250/mo |
| 100M Vectors | $1000/mo | $2500/mo | $1500/mo |
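For a quick comparison, the table can be loaded as data and queried for the cheapest option at each volume. The figures are copied from the table above; the dictionary layout and variable names are just one convenient way to hold them.

```python
# Monthly cost in USD, copied from the comparison table.
monthly_cost = {
    "10k": {"Chroma": 5, "Pinecone": 0, "OpenSearch": 150},
    "1M": {"Chroma": 40, "Pinecone": 30, "OpenSearch": 250},
    "100M": {"Chroma": 1000, "Pinecone": 2500, "OpenSearch": 1500},
}

# For each volume, pick the database with the lowest listed price.
cheapest = {
    volume: min(prices, key=prices.get)
    for volume, prices in monthly_cost.items()
}

for volume, winner in cheapest.items():
    print(f"{volume} vectors: {winner} (${monthly_cost[volume][winner]}/mo)")
```

This ranks on listed price only. It does not account for the time spent operating a self-hosted server, nor for security or availability requirements, which can outweigh the raw monthly figure.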
5. Summary and Key Takeaways
- Pinecone is the winner for beginners and startups due to its low entry cost.
- OpenSearch is the winner for established enterprises with high volume and complex security needs.
- Chroma is the winner for edge devices and internal developer tools.
- RAM is the Driver: Regardless of the tool, the number of vectors in RAM is your primary cost driver.
In the next lesson, we’ll look at the Operational Complexity of these systems.