Module 4 Lesson 2: Cost-Aware Prompting
Saving Money by Design. How to optimize your prompts to use fewer tokens and reduce your AWS bill.
Economics of the Prompt: Saving Money
Every character you send to Bedrock and every word it generates costs you money. In an enterprise app with millions of users, "Prompt Bloat" can cost thousands of dollars a month.
1. The Token Counter
Bedrock models don't see words; they see Tokens. 1,000 tokens is roughly 750 words.
- You are billed for (Input Tokens) + (Output Tokens).
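To make the billing formula concrete, here is a minimal back-of-the-envelope cost estimator. The per-1,000-token prices and the 4-characters-per-token heuristic are illustrative assumptions, not real Bedrock rates; check the current pricing page for your model and region.

```python
# A rough cost estimator. PRICES ARE PLACEHOLDERS, not real Bedrock
# rates -- look up the current price for your model and region.
INPUT_PRICE_PER_1K = 0.003   # hypothetical $ per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # hypothetical $ per 1,000 output tokens

def rough_token_count(text: str) -> int:
    """Crude heuristic: English text averages ~4 characters per token."""
    return max(1, len(text) // 4)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one call: input tokens + output tokens."""
    return (input_tokens / 1_000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1_000) * OUTPUT_PRICE_PER_1K

# The "giant prompt" scenario from the diagram in section 3 below:
print(f"${estimate_cost(5_000, 50):.4f}")  # vs. estimate_cost(500, 50)
```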
2. Strategies for Efficiency
- Condense Your Context: Don't send a whole 50-page PDF if you only need page 3.
- Be Concise: Instead of "Please kindly write a short summary of about two paragraphs," write "Summarize in 2 paragraphs." That small change saves roughly 8 tokens, which adds up fast across millions of calls.
- Stop Sequences: Use `stopSequences` to tell the model exactly when to finish. This prevents it from "rambling" on and wasting output tokens (see the sketch after this list).
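Here is a minimal sketch of `stopSequences` in practice with the Bedrock Converse API via boto3. The region and model ID are examples; swap in whatever your account actually runs. The model is asked for exactly 3 tips, and generation is cut off the moment it tries to start a 4th:

```python
import boto3

# Region and model ID are examples -- use your own.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": "List exactly 3 AWS cost-saving tips."}],
    }],
    inferenceConfig={
        "maxTokens": 200,          # hard ceiling on output spend
        "stopSequences": ["4."],   # stop if the model starts a 4th item
    },
)
print(response["output"]["message"]["content"][0]["text"])
```

Note that `maxTokens` and `stopSequences` work together: the stop sequence ends generation at a semantic boundary, while `maxTokens` is the absolute budget cap if the stop sequence never appears.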
3. Visualizing Token Flow
```mermaid
graph LR
    P[Giant 5,000-token prompt] --> M1[AI Brain] --> O1[50-token answer] --> T1[5,050 tokens paid]
    P2[Small 500-token prompt] --> M2[AI Brain] --> O2[50-token answer] --> T2[550 tokens paid: ~90% savings]
```
4. Temperature and Reproducibility
- Temperature: High (1.0) = Creative/Random; Low (0.0) = Precise/Boring.
- For business apps (data extraction, summarizing), use Temperature: 0. This makes outputs near-deterministic, so the model gives (almost) the same answer every time and doesn't "explore" expensive hallucinations. See the snippet below.
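A sketch of the same `converse()` call tuned for a data-extraction task. It reuses the `client` from the earlier example; only the prompt and `inferenceConfig` change, and the model ID is again just an example:

```python
# Deterministic-style settings for extraction; reuses `client` above.
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Extract the invoice total from the text below as JSON."}],
    }],
    inferenceConfig={
        "temperature": 0,   # minimize sampling randomness
        "maxTokens": 100,   # extraction output should be tiny
    },
)
```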
Summary
- Input Tokens (your prompt) cost money too.
- Conciseness is an engineering skill, not just a writing style.
- `stopSequences` prevents expensive rambling.
- Temperature: 0 is the standard for stable, cost-aware business logic.