Module 15 Lesson 2: Cost Monitoring
AWS Bedrock

Protecting the Wallet. How to track token usage and set up alerts to prevent unexpected AWS bills from your GenAI apps.

Cost Tracking: No Surprises

Generative AI is one of the few cloud services where a single user can accidentally run up a bill on the order of $100 within minutes by triggering an infinite logic loop that keeps calling the model. Cost monitoring is not a luxury; it's a survival requirement.

1. Token Metrics in CloudWatch

Bedrock automatically sends InputTokenCount and OutputTokenCount metrics to CloudWatch.

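  • You can build a dashboard that shows your total token usage per hour. The metrics report token counts, not dollars, so multiply them by your model's per-token price to estimate spend.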
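
To make the step from token counts to dollars concrete, here is a minimal sketch in Python. It assumes the metrics are published in the AWS/Bedrock namespace with a ModelId dimension, that boto3 is installed and AWS credentials are configured for the fetch step, and that you supply your own prices. The function names and the model ID are illustrative, not part of any AWS API.

```python
from datetime import datetime, timedelta, timezone


def hourly_token_query(metric_name, model_id, hours=24, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Bedrock",  # assumed namespace for Bedrock runtime metrics
        "MetricName": metric_name,   # "InputTokenCount" or "OutputTokenCount"
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,              # one datapoint per hour
        "Statistics": ["Sum"],
    }


def estimate_cost(input_tokens, output_tokens, usd_per_1k_input, usd_per_1k_output):
    """Convert token counts to dollars using caller-supplied prices."""
    return (input_tokens / 1000) * usd_per_1k_input \
        + (output_tokens / 1000) * usd_per_1k_output


def fetch_hourly_sums(metric_name, model_id, hours=24):
    """Return {timestamp: token_sum} per hour. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **hourly_token_query(metric_name, model_id, hours)
    )
    return {point["Timestamp"]: point["Sum"] for point in response["Datapoints"]}
```

With placeholder prices of $0.003 per 1,000 input tokens and $0.015 per 1,000 output tokens, estimate_cost(1_000_000, 200_000, 0.003, 0.015) comes to about $6. Look up the real prices for your model and region before relying on the numbers.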

2. Setting up Alarms

  1. Go to CloudWatch Alarms.
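  2. Create an alarm on InputTokenCount, and a second one on OutputTokenCount, since output tokens are usually priced higher than input tokens.
  3. Set a threshold (e.g., more than 1 million tokens in 1 hour).
  4. Action: send an SNS notification (email/SMS), or trigger a Lambda function that cuts off access to Bedrock (for example, by attaching a deny policy to the application's IAM role) to stop the bleeding.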
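
The console steps above can also be scripted. The sketch below builds the parameters for CloudWatch's put_metric_alarm call with boto3. It assumes the AWS/Bedrock namespace and ModelId dimension, and the alarm name, model ID, and SNS topic ARN are placeholders you would replace with your own values.

```python
def token_alarm_params(alarm_name, model_id, sns_topic_arn,
                       metric_name="InputTokenCount",
                       threshold=1_000_000, period_seconds=3600):
    """Build put_metric_alarm arguments: fire when the token sum in one
    period exceeds the threshold."""
    return {
        "AlarmName": alarm_name,
        "AlarmDescription": f"{metric_name} above {threshold} in {period_seconds}s",
        "Namespace": "AWS/Bedrock",  # assumed namespace for Bedrock runtime metrics
        "MetricName": metric_name,
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no traffic means no alarm
        "AlarmActions": [sns_topic_arn],     # SNS topic that emails you or invokes Lambda
    }


def create_token_alarm(**kwargs):
    """Create the alarm. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so token_alarm_params works without it

    boto3.client("cloudwatch").put_metric_alarm(**token_alarm_params(**kwargs))
```

Calling create_token_alarm once per metric (InputTokenCount and OutputTokenCount) and per model gives you one alarm for each combination you care about.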

3. Visualizing the Burn Rate

graph LR
    Day1[1,000 Tokens] --> Day2[1,500 Tokens]
    Day2 --> Day3[Infinite Loop Attack!]
    Day3 --> Spike[1,000,000 Tokens]
    Spike --> Alarm[ALARM TRIGGERED]
    Alarm --> Cutoff[API Disabled]
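
The last hop in the diagram, from alarm to cutoff, could be handled by a Lambda function subscribed to the alarm's SNS topic. This is one possible approach, not the only one: the sketch attaches an inline deny policy to the IAM role your application uses to call Bedrock. The APP_ROLE_NAME environment variable and the policy name are hypothetical, and the Lambda's own role would need permission to call iam:PutRolePolicy.

```python
import json
import os

# Inline policy that blocks model invocation for whichever role it is attached to.
DENY_BEDROCK_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
        "Resource": "*",
    }],
}


def alarm_is_firing(sns_event):
    """True if any SNS record carries a CloudWatch alarm in the ALARM state."""
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            return True
    return False


def lambda_handler(event, context):
    if not alarm_is_firing(event):
        return {"blocked": False}

    import boto3  # available in the Lambda Python runtime

    role_name = os.environ["APP_ROLE_NAME"]  # hypothetical: the app's IAM role
    boto3.client("iam").put_role_policy(
        RoleName=role_name,
        PolicyName="bedrock-emergency-cutoff",  # hypothetical policy name
        PolicyDocument=json.dumps(DENY_BEDROCK_POLICY),
    )
    return {"blocked": True, "role": role_name}
```

Because an explicit deny overrides any allow, the role loses access until someone removes the inline policy (for instance with the IAM delete_role_policy call) after the root cause is fixed.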

4. Per-Model Tracking

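Not all tokens are equal. A token from a large model such as Claude Opus can cost more than ten times as much as a token from a small model such as Claude Haiku, and output tokens typically cost more than input tokens. Break the CloudWatch token metrics down by their ModelId dimension, and use Cost Allocation Tags on your tagged Bedrock resources, to see which model is eating your budget.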
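
A small worked example shows why raw token counts can mislead. The sketch below ranks models by estimated cost; the model names and prices are placeholders chosen for illustration, not real Bedrock prices.

```python
# Placeholder prices in USD per 1,000 tokens: (input, output). NOT real prices.
EXAMPLE_PRICES = {
    "example-large-model": (0.015, 0.075),
    "example-small-model": (0.001, 0.005),
}


def cost_by_model(usage, prices):
    """usage maps model_id -> (input_tokens, output_tokens).
    Returns model_id -> estimated USD, most expensive first."""
    costs = {}
    for model_id, (input_tokens, output_tokens) in usage.items():
        in_price, out_price = prices[model_id]
        costs[model_id] = (input_tokens / 1000) * in_price \
            + (output_tokens / 1000) * out_price
    return dict(sorted(costs.items(), key=lambda item: item[1], reverse=True))


usage = {
    "example-small-model": (2_000_000, 500_000),  # ten times the traffic
    "example-large-model": (200_000, 50_000),
}
for model_id, usd in cost_by_model(usage, EXAMPLE_PRICES).items():
    print(f"{model_id}: ${usd:.2f}")
```

Under these assumed prices the large model tops the list even though it handled a tenth of the tokens, which is the situation per-model tracking is meant to expose.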


Summary

  • Token Metrics are sent to CloudWatch automatically.
  • Alarms are your primary defense against infinite loops and budget overruns.
  • Notifications ensure you are alerted before the bill arrives.
  • Budgets should be set per-environment (Development vs Production).
