
The Price of Intelligence: AI Cost Drivers
Master the bill. Learn how tokens, instances, and storage define your AWS AI budget and how to optimize for every dollar.
Budgeting for the Brain
AI is one of the most expensive workloads in the cloud. It’s not just "Storage and Compute"; it involves high-performance GPUs, massive datasets, and complex API pricing.
On the AWS Certified AI Practitioner exam, you must be able to identify the primary "Cost Drivers" for a scenario. If a company's bill is 10x higher than expected, you should know exactly which knob to turn.
1. GenAI Cost Driver: Tokens (Amazon Bedrock)
As we learned in Module 3, GenAI doesn't charge "Per Query"; it charges "Per Token."
- Input Tokens: The prompt you send to the AI.
- Output Tokens: The text the AI writes back.
The Strategic Logic:
- If you ask an AI to "Summarize this 100-page book in one sentence," you are paying for roughly 100,000 input tokens but only about 20 output tokens.
- Cost Tip: Usually, Output Tokens are significantly more expensive than Input Tokens.
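The input/output split above lends itself to a quick back-of-envelope calculator. This is a minimal sketch; the per-1,000-token prices are hypothetical placeholders, not real Bedrock rates, so always check the current Amazon Bedrock pricing page for your model and region.

```python
# Hypothetical per-1,000-token prices -- NOT real Bedrock rates.
# Note the output rate is set higher, reflecting the "Cost Tip" above.
INPUT_PRICE_PER_1K = 0.003   # illustrative $ per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # illustrative $ per 1,000 output tokens

def estimate_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single model invocation."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# The "summarize a book" scenario: huge input, tiny output.
# Almost the entire cost comes from the input side.
cost = estimate_request_cost(input_tokens=100_000, output_tokens=20)
print(f"${cost:.4f}")
```

Even with output tokens priced 5x higher, the book-summary request is dominated by input cost, which is why trimming prompts is the first optimization lever.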
2. ML Cost Driver: Instances (Amazon SageMaker)
If you are using SageMaker, you aren't paying for tokens; you are paying for "Up-Time" on virtual servers (EC2 instances).
- Training: Charged by the Type of GPU (e.g., P4d instances are very expensive) and the Duration of the training job.
- Hosting: Charged by having an endpoint Always On. Even if no one is using the AI at 3 AM, you are still paying for the server.
The Optimization: Use SageMaker Serverless Inference for low-traffic apps to avoid paying for idle time.
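To see why idle time matters, here is a sketch comparing an always-on endpoint with a pay-per-use serverless model. The hourly rate, per-second rate, and traffic figures are hypothetical illustrative numbers, not published SageMaker prices.

```python
# Hypothetical rates -- NOT real SageMaker prices.
HOURLY_INSTANCE_RATE = 1.20       # illustrative $/hour for an always-on endpoint
SERVERLESS_RATE_PER_SEC = 0.0002  # illustrative $ per second of compute used

def always_on_monthly_cost(hours: float = 730) -> float:
    """A provisioned endpoint bills for every hour, busy or idle."""
    return hours * HOURLY_INSTANCE_RATE

def serverless_monthly_cost(requests: int, secs_per_request: float) -> float:
    """Serverless inference bills only for compute actually consumed."""
    return requests * secs_per_request * SERVERLESS_RATE_PER_SEC

# A low-traffic app: 10,000 requests/month at 0.5 s each
# is only ~83 minutes of real compute in a 730-hour month.
print(always_on_monthly_cost())              # pays for 3 AM idle time too
print(serverless_monthly_cost(10_000, 0.5))  # pays only per request
```

With these illustrative numbers the always-on endpoint costs hundreds of dollars a month while serverless costs about a dollar, which is the whole argument for serverless at low traffic.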
3. The "Silent" Cost Driver: Data and Storage
Moving and storing your data costs money.
- S3 Storage: Storing trillions of images or text files.
- Data Transfer: Moving data out of an AWS region or between different AWS services.
- KMS: Charging for the management and use of encryption keys.
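These three line items can be sketched into one monthly estimate. All rates below are hypothetical placeholders; the real S3, data-transfer, and KMS prices vary by region and tier.

```python
# Hypothetical rates -- NOT real AWS prices.
S3_PER_GB_MONTH = 0.023      # illustrative S3 storage rate per GB-month
TRANSFER_OUT_PER_GB = 0.09   # illustrative data-transfer-out rate per GB
KMS_PER_10K_REQUESTS = 0.03  # illustrative rate per 10,000 KMS key-use requests

def monthly_data_cost(stored_gb: float, egress_gb: float,
                      kms_requests: int) -> float:
    """Sum the three 'silent' data costs for one month."""
    return (stored_gb * S3_PER_GB_MONTH
            + egress_gb * TRANSFER_OUT_PER_GB
            + (kms_requests / 10_000) * KMS_PER_10K_REQUESTS)

# A 5 TB training dataset, 100 GB moved cross-region,
# and 1 million encrypted object reads:
print(round(monthly_data_cost(5_000, 100, 1_000_000), 2))
```

Storage usually dominates for large training datasets, but egress is the item that surprises teams, because it scales with how often data moves, not how much sits still.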
4. Visualizing the Cost Stack
| Driver | Service | Pricing Unit |
|---|---|---|
| Tokens | Amazon Bedrock | Per 1,000 - 1,000,000 tokens |
| Compute | Amazon SageMaker | Per Instance-hour |
| API Call | Amazon Rekognition | Per Image / Per minute of video |
| Storage | Amazon S3 | Per GB per month |
```mermaid
graph TD
    A[Total AI Bill] --> B[Bedrock: Usage-Based]
    A --> C[SageMaker: Time-Based]
    A --> D[S3: Volume-Based]
    B --> B1[Input Tokens]
    B --> B2[Output Tokens]
    C --> C1[Training Hours]
    C --> C2[Hosting Hours]
```
5. Summary: Financial Awareness
As a Practitioner, you should always ask three questions before launching:
- Is this model "Right-Sized"? (Using a smaller model saves money).
- Can we use Spot Instances for training? (Can save up to 90%).
- Should we use Serverless or Provisioned?
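The Spot question from the checklist above is easy to quantify. This sketch uses a hypothetical on-demand rate and a 70% Spot discount (AWS quotes "up to 90%", but the actual discount fluctuates with Spot market capacity).

```python
# Hypothetical rate and discount -- Spot discounts vary by
# instance type, region, and current capacity.
ON_DEMAND_RATE = 32.77  # illustrative $/hour for a large GPU training instance
SPOT_DISCOUNT = 0.70    # illustrative 70% discount off on-demand

def training_cost(hours: float, use_spot: bool = False) -> float:
    """Cost of a training job on-demand vs. on Spot capacity."""
    rate = ON_DEMAND_RATE * (1 - SPOT_DISCOUNT) if use_spot else ON_DEMAND_RATE
    return hours * rate

print(training_cost(24))                 # a 24-hour job on-demand
print(training_cost(24, use_spot=True))  # the same job on Spot
```

The trade-off: Spot capacity can be reclaimed mid-job, so training code should checkpoint regularly (SageMaker Managed Spot Training handles the retry for you).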
Exercise: Identify the Cost Driver
A company is using Amazon Bedrock to translate 1 million customer support tickets. Each ticket is roughly 500 words. They found their bill was $5,000 this month. What is the primary "Unit of measure" that defined this cost?
- A. Server Up-time (hours).
- B. Number of GPU chips used.
- C. Tokens processed.
- D. GB of data stored in S3.
The Answer is C! Amazon Bedrock is a serverless, token-based service. The "Volume" of the text (500 words per ticket) is the main driver of the cost.
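The exercise's numbers roughly check out. This sketch assumes a common heuristic of about 1.33 tokens per English word and a hypothetical blended per-1,000-token price; both are illustrative, not Bedrock's actual rates.

```python
# Illustrative assumptions -- NOT real Bedrock pricing.
TICKETS = 1_000_000
WORDS_PER_TICKET = 500
TOKENS_PER_WORD = 1.33        # rough heuristic for English text
PRICE_PER_1K_TOKENS = 0.0075  # hypothetical blended input+output rate

# Total token volume drives the bill, not server hours or GPU count.
tokens = TICKETS * WORDS_PER_TICKET * TOKENS_PER_WORD
cost = tokens / 1000 * PRICE_PER_1K_TOKENS
print(f"{tokens:,.0f} tokens -> ${cost:,.2f}")
```

With these assumed rates, 1 million 500-word tickets come to roughly 665 million tokens and land near the $5,000 bill in the scenario, confirming that token volume is the unit of measure.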
Knowledge Check
How is Amazon Bedrock primarily priced for foundation model usage?
What's Next?
Cheaper isn't always better. In the next lesson, we see the "Performance" side of the coin. Find out in Lesson 2: Performance Trade-offs (Latency vs. Accuracy).