
The Price of Intelligence: AI Cost Drivers
Master the bill. Learn how tokens, instances, and storage define your AWS AI budget and how to optimize for every dollar.
Budgeting for the Brain
AI is one of the most expensive workloads in the cloud. It’s not just "Storage and Compute"; it involves high-performance GPUs, massive datasets, and complex API pricing.
On the AWS Certified AI Practitioner exam, you must be able to identify the primary "Cost Drivers" for a scenario. If a company's bill is 10x higher than expected, you should know exactly which knob to turn.
1. GenAI Cost Driver: Tokens (Amazon Bedrock)
As we learned in Module 3, GenAI doesn't charge "Per Query"; it charges "Per Token."
- Input Tokens: The prompt you send to the AI.
- Output Tokens: The text the AI writes back.
The Strategic Logic:
- If you ask an AI to "Summarize this 100-page book in one sentence," you are paying for roughly 100,000 input tokens but only about 20 output tokens.
- Cost Tip: Usually, Output Tokens are significantly more expensive than Input Tokens.
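The input/output split above lends itself to a quick back-of-envelope calculator. This is a minimal sketch; the per-1,000-token prices are hypothetical placeholders, not real Bedrock rates, so always check the current Amazon Bedrock pricing page for your model and region.

```python
# Hypothetical per-1,000-token prices -- NOT real Bedrock rates.
# Note the output rate is set higher, reflecting the "Cost Tip" above.
INPUT_PRICE_PER_1K = 0.003   # illustrative $ per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # illustrative $ per 1,000 output tokens

def estimate_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single model invocation."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# The "summarize a book" scenario: huge input, tiny output.
# Almost the entire cost comes from the input side.
cost = estimate_request_cost(input_tokens=100_000, output_tokens=20)
print(f"${cost:.4f}")
```

Even with output tokens priced 5x higher, the book-summary request is dominated by input cost, which is why trimming prompts is the first optimization lever.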
2. ML Cost Driver: Instances (Amazon SageMaker)
If you are using SageMaker, you aren't paying for tokens; you are paying for "Up-Time" on virtual servers (EC2 instances).
- Training: Charged by the Type of GPU (e.g., P4d instances are very expensive) and the Duration of the training job.
- Hosting: Charged by having an endpoint Always On. Even if no one is using the AI at 3 AM, you are still paying for the server.
The Optimization: Use SageMaker Serverless Inference for low-traffic apps to avoid paying for idle time.
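To see why idle time matters, here is a sketch comparing an always-on endpoint with a pay-per-use serverless model. The hourly rate, per-second rate, and traffic figures are hypothetical illustrative numbers, not published SageMaker prices.

```python
# Hypothetical rates -- NOT real SageMaker prices.
HOURLY_INSTANCE_RATE = 1.20       # illustrative $/hour for an always-on endpoint
SERVERLESS_RATE_PER_SEC = 0.0002  # illustrative $ per second of compute used

def always_on_monthly_cost(hours: float = 730) -> float:
    """A provisioned endpoint bills for every hour, busy or idle."""
    return hours * HOURLY_INSTANCE_RATE

def serverless_monthly_cost(requests: int, secs_per_request: float) -> float:
    """Serverless inference bills only for compute actually consumed."""
    return requests * secs_per_request * SERVERLESS_RATE_PER_SEC

# A low-traffic app: 10,000 requests/month at 0.5 s each
# is only ~83 minutes of real compute in a 730-hour month.
print(always_on_monthly_cost())              # pays for 3 AM idle time too
print(serverless_monthly_cost(10_000, 0.5))  # pays only per request
```

With these illustrative numbers the always-on endpoint costs hundreds of dollars a month while serverless costs about a dollar, which is the whole argument for serverless at low traffic.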
3. The "Silent" Cost Driver: Data and Storage
Moving and storing your data costs money.
- S3 Storage: Storing trillions of images or text files.
- Data Transfer: Moving data out of an AWS region or between different AWS services.
- KMS: Charging for the management and use of encryption keys.
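These three line items can be sketched into one monthly estimate. All rates below are hypothetical placeholders; the real S3, data-transfer, and KMS prices vary by region and tier.

```python
# Hypothetical rates -- NOT real AWS prices.
S3_PER_GB_MONTH = 0.023      # illustrative S3 storage rate per GB-month
TRANSFER_OUT_PER_GB = 0.09   # illustrative data-transfer-out rate per GB
KMS_PER_10K_REQUESTS = 0.03  # illustrative rate per 10,000 KMS key-use requests

def monthly_data_cost(stored_gb: float, egress_gb: float,
                      kms_requests: int) -> float:
    """Sum the three 'silent' data costs for one month."""
    return (stored_gb * S3_PER_GB_MONTH
            + egress_gb * TRANSFER_OUT_PER_GB
            + (kms_requests / 10_000) * KMS_PER_10K_REQUESTS)

# A 5 TB training dataset, 100 GB moved cross-region,
# and 1 million encrypted object reads:
print(round(monthly_data_cost(5_000, 100, 1_000_000), 2))
```

Storage usually dominates for large training datasets, but egress is the item that surprises teams, because it scales with how often data moves, not how much sits still.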
4. Visualizing the Cost Stack
| Driver | Service | Pricing Unit |
|---|---|---|
| Tokens | Amazon Bedrock | Per 1,000 - 1,000,000 tokens |
| Compute | Amazon SageMaker | Per Instance-hour |
| API Call | Amazon Rekognition | Per Image / Per minute of video |
| Storage | Amazon S3 | Per GB per month |
```mermaid
graph TD
    A[Total AI Bill] --> B[Bedrock: Usage-Based]
    A --> C[SageMaker: Time-Based]
    A --> D[S3: Volume-Based]
    B --> B1[Input Tokens]
    B --> B2[Output Tokens]
    C --> C1[Training Hours]
    C --> C2[Hosting Hours]
```
5. Summary: Financial Awareness
As a Practitioner, you should always ask three questions before launching:
- Is this model "Right-Sized"? (Using a smaller model saves money).
- Can we use Spot Instances for training? (Can save up to 90%).
- Should we use Serverless or Provisioned?
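The Spot question from the checklist above is easy to quantify. This sketch uses a hypothetical on-demand rate and a 70% Spot discount (AWS quotes "up to 90%", but the actual discount fluctuates with Spot market capacity).

```python
# Hypothetical rate and discount -- Spot discounts vary by
# instance type, region, and current capacity.
ON_DEMAND_RATE = 32.77  # illustrative $/hour for a large GPU training instance
SPOT_DISCOUNT = 0.70    # illustrative 70% discount off on-demand

def training_cost(hours: float, use_spot: bool = False) -> float:
    """Cost of a training job on-demand vs. on Spot capacity."""
    rate = ON_DEMAND_RATE * (1 - SPOT_DISCOUNT) if use_spot else ON_DEMAND_RATE
    return hours * rate

print(training_cost(24))                 # a 24-hour job on-demand
print(training_cost(24, use_spot=True))  # the same job on Spot
```

The trade-off: Spot capacity can be reclaimed mid-job, so training code should checkpoint regularly (SageMaker Managed Spot Training handles the retry for you).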
Exercise: Identify the Cost Driver
A company is using Amazon Bedrock to translate 1 million customer support tickets. Each ticket is roughly 500 words. They found their bill was $5,000 this month. What is the primary "Unit of measure" that defined this cost?
- A. Server Up-time (hours).
- B. Number of GPU chips used.
- C. Tokens processed.
- D. GB of data stored in S3.
The Answer is C! Amazon Bedrock is a serverless, token-based service. The "Volume" of the text (500 words per ticket) is the main driver of the cost.
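The exercise's numbers roughly check out. This sketch assumes a common heuristic of about 1.33 tokens per English word and a hypothetical blended per-1,000-token price; both are illustrative, not Bedrock's actual rates.

```python
# Illustrative assumptions -- NOT real Bedrock pricing.
TICKETS = 1_000_000
WORDS_PER_TICKET = 500
TOKENS_PER_WORD = 1.33        # rough heuristic for English text
PRICE_PER_1K_TOKENS = 0.0075  # hypothetical blended input+output rate

# Total token volume drives the bill, not server hours or GPU count.
tokens = TICKETS * WORDS_PER_TICKET * TOKENS_PER_WORD
cost = tokens / 1000 * PRICE_PER_1K_TOKENS
print(f"{tokens:,.0f} tokens -> ${cost:,.2f}")
```

With these assumed rates, 1 million 500-word tickets come to roughly 665 million tokens and land near the $5,000 bill in the scenario, confirming that token volume is the unit of measure.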
Knowledge Check
How is Amazon Bedrock primarily priced for foundation model usage?
What's Next?
Cheaper isn't always better. In the next lesson, we see the "Performance" side of the coin. Find out in Lesson 2: Performance Trade-offs (Latency vs. Accuracy).