
Reliability Patterns
Uptime engineering.
Reliability Patterns
The Dead Letter Queue
If a graph enters an error loop that retries failed, move it to a "Hospital" queue. A human engineer reviews: "Ah, the API key expired." Fixes it -> Resumes graph.
Monitoring
Alert on:
- High Loop Counts.
- High Token Usage.
- Low "Success" output rate.