
Cost Attribution: Who Spent the Budget?
Learn to track and attribute token costs in complex agent networks. Master the metrics of 'Cost per Task' and 'Agent ROI'.
In a multi-agent system, your AWS or OpenAI bill arrives as a single aggregate number. You can see that you spent $500 yesterday, but you don't know which agent spent it. Was it the creative "Writer" agent? Or was it a "Searcher" agent stuck in a loop?
Without Attribution, you cannot optimize. You might spend hours refactoring the prompt of an agent that accounts for only 1% of your costs, while ignoring the high-volume "Silent Spender."
In this final lesson of Module 12, we learn how to implement Granular Cost Tracking in multi-agent graphs. We’ll build an "Attribution Dashboard" and learn the metrics for "Agent ROI."
1. Tracking by "Node" or "Node Class"
In frameworks like LangGraph, every interaction happens in a "Node." You should wrap these nodes in a Telemetry Decorator.
The Goal:
- Researcher Agent: 4,500 tokens / $0.05.
- Coder Agent: 12,000 tokens / $0.12.
- Supervisor: 500 tokens / $0.005.
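The list above is the target artifact: a small per-node ledger. A minimal sketch of that data shape in Python (the names and numbers are illustrative, mirroring the goal figures):

from dataclasses import dataclass

@dataclass
class NodeSpend:
    node: str
    tokens: int
    cost_usd: float

ledger = [
    NodeSpend("researcher", 4_500, 0.05),
    NodeSpend("coder", 12_000, 0.12),
    NodeSpend("supervisor", 500, 0.005),
]
for entry in ledger:
    print(f"{entry.node}: {entry.tokens:,} tokens / ${entry.cost_usd}")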
2. Implementation: The Attribution Wrapper (Python)
Python Code: Tracking Usage per Node
from functools import wraps

# Global registry for cost tracking
node_costs = {}

def calculate_aws_cost(usage):
    # Placeholder pricing: a flat rate per 1,000 tokens.
    # Swap in your provider's real price table here.
    return usage.get("tokens", 0) / 1000 * 0.01

def track_cost(node_name):
    def decorator(func):
        @wraps(func)
        def wrapper(state):
            # 1. Execute the agent logic
            result = func(state)
            # 2. Extract usage (assuming the LLM/tool metadata is present)
            usage = result.get("usage_metadata", {})
            cost = calculate_aws_cost(usage)
            # 3. Attribute the cost to this node
            node_costs[node_name] = node_costs.get(node_name, 0) + cost
            return result
        return wrapper
    return decorator

@track_cost("code_agent")
def handle_coding(state):
    # Agent logic would go here...
    return {"ans": "Done", "usage_metadata": {"tokens": 5000}}
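Calling the decorated node updates the registry automatically. A quick check, using the placeholder pricing above:

handle_coding({"task": "write a parser"})
handle_coding({"task": "fix a bug"})
print(node_costs)  # {'code_agent': 0.1} (two calls at 5,000 tokens each)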
3. Metric: Cost per Success (CPS)
Cost per Success is the ultimate efficiency metric: divide total spend by the number of tasks that actually succeeded, not just the number attempted.
- Total Cost: $1.00.
- Tasks Attempted: 10.
- Tasks Succeeded: 5.
- CPS: $1.00 / 5 = $0.20 per successful task.
Optimization: If a highly "Intelligent" agent has a high per-call price but a high success rate, it may actually be Cheaper per success than a budget agent that fails 50% of the time (and forces a retry). The sketch below makes this concrete.
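A minimal sketch of this comparison (the per-call prices and success rates are invented for illustration):

def cost_per_success(cost_per_call, success_rate):
    # Expected attempts until success is 1 / success_rate,
    # so the expected cost per successful task is:
    return cost_per_call / success_rate

# "Smart" agent: pricier per call, but almost always succeeds.
print(cost_per_success(0.12, 0.95))  # ~$0.126 per success

# "Cheap" agent: a third cheaper per call, but fails half the time.
print(cost_per_success(0.08, 0.50))  # $0.16 per success: the cheap agent loses

Note that this simple model ignores the retry overhead itself (supervisor turns, re-sent context), which would make the cheap agent even more expensive in practice.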
4. Visualizing the "Money Flow" (Mermaid)
You can generate diagrams to show where the "Token Blood" is flowing in your graph.
graph TD
    U[User] -->|Free| S[Supervisor]
    S -->|Low Cost| A[Searcher: $0.10]
    S -->|High Cost| B[Architect: $0.80]
    B -->|Medium Cost| C[Coder: $0.40]
    style B fill:#f66,stroke-width:4px
By looking at this diagram, a Lead Engineer can immediately see: "The Architect is costing us too much. Can we summarize its input better?"
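You don't have to draw these by hand. A small sketch that emits Mermaid syntax from the node_costs registry built earlier (the supervisor edge and the $0.50 highlight threshold are illustrative choices):

def costs_to_mermaid(costs, hot_threshold=0.50):
    # One edge per tracked node; expensive nodes get highlighted.
    lines = ["graph TD"]
    for node, cost in costs.items():
        lines.append(f"    S[Supervisor] --> {node}[{node}: ${cost:.2f}]")
        if cost >= hot_threshold:
            lines.append(f"    style {node} fill:#f66,stroke-width:4px")
    return "\n".join(lines)

print(costs_to_mermaid({"Searcher": 0.10, "Architect": 0.80, "Coder": 0.40}))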
5. Token Efficiency and Agent "Sunsetting"
If an agent has a high cost but is rarely called, consider Consolidating its responsibilities into another agent. Conversely, if an agent is called 10,000 times a day (like a "Formalizer"), it is a prime candidate for Small Model Distillation, or even hard-coded regex, to eliminate the AI cost entirely. A triage sketch follows.
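A minimal triage sketch over per-agent stats (the field names and thresholds are invented for illustration):

def triage(agents):
    # agents: dicts with "name", "daily_calls", and "daily_cost" keys.
    for a in agents:
        if a["daily_cost"] > 5.0 and a["daily_calls"] < 10:
            print(f"{a['name']}: expensive and rarely used -> consolidate")
        elif a["daily_calls"] > 10_000:
            print(f"{a['name']}: high volume -> distill or replace with regex")

triage([
    {"name": "legal_reviewer", "daily_calls": 4, "daily_cost": 9.20},
    {"name": "formalizer", "daily_calls": 25_000, "daily_cost": 30.00},
])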
6. Summary and Key Takeaways
- Tag Every Turn: Use decorators to attribute prompt/completion tokens to specific agents.
- Success Metrics: Factor in the "Retries" caused by cheap, low-accuracy agents.
- Heatmap Analysis: Identify the "Hottest" nodes in your graph for prompt optimization.
- Attribution is the First Step to Optimization: You can only fix what you can measure.
Exercise: The Billing Audit
- Run a 3-agent graph 10 times.
- Record the usage for each agent.
- Create a pie chart of "Spend by Agent."
- Identify the 'Cost Leader':
- Which agent spent the most?
- Is that agent performing the most important logic?
- Challenge: Try to reduce the cost leader's base prompt by 10%. Calculate the annual savings.
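A starter script for steps 3 and 4, assuming you have aggregated per-agent spend into a dict (the numbers are placeholders; matplotlib draws the pie chart):

import matplotlib.pyplot as plt

# Pretend these totals came from your 10 runs of the 3-agent graph.
spend = {"supervisor": 0.05, "researcher": 0.50, "coder": 1.20}

plt.pie(list(spend.values()), labels=list(spend.keys()), autopct="%1.1f%%")
plt.title("Spend by Agent")
plt.show()

# Rough annual-savings estimate for the challenge: a 10% cut on the cost
# leader, assuming these runs represent one typical day of traffic.
leader_cost = max(spend.values())
print(f"Estimated annual savings: ${leader_cost * 0.10 * 365:.2f}")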