Scaling Execution

Handling 10,000 concurrent agents.

Agent graphs are usually CPU-light but IO-heavy: most of their wall time is spent waiting for LLM responses. Use AsyncIO. A single Python process can handle thousands of concurrent graphs if they are mostly awaiting an API like OpenAI's, because the event loop interleaves all of that network wait.
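A minimal sketch of this pattern, using `asyncio.sleep` as a stand-in for the network latency of a real LLM call (the function and graph names here are illustrative, not from any specific framework):

```python
import asyncio

async def call_llm(prompt: str) -> str:
    # Placeholder for an LLM API call; real code would await an async
    # HTTP client. The latency is IO wait, not CPU, so the event loop
    # can interleave thousands of these calls.
    await asyncio.sleep(0.05)  # simulated network latency
    return f"response to {prompt!r}"

async def run_graph(graph_id: int) -> str:
    # A minimal two-node "graph": each node awaits the LLM in turn.
    draft = await call_llm(f"draft for graph {graph_id}")
    final = await call_llm(f"refine: {draft}")
    return final

async def main(n: int) -> list[str]:
    # Launch n graphs concurrently; total wall time stays near the
    # latency of one graph, not n times it.
    return await asyncio.gather(*(run_graph(i) for i in range(n)))

results = asyncio.run(main(1000))
```

Running 1,000 graphs this way finishes in roughly the latency of one graph, since every coroutine is suspended on IO rather than holding the CPU.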

However, if you run local models or other heavy compute nodes, AsyncIO alone won't help: those tasks occupy a CPU or GPU rather than waiting on the network. You need a task queue (Celery, Temporal). The node simply enqueues a job and yields; a separate worker process picks it up, runs the heavy work, and reports the result back.
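The shape of that hand-off can be sketched with the standard library alone. This is not Celery or Temporal code; it is a single-process illustration of the node/queue/worker split, with a thread standing in for the remote worker and `heavy_inference` as a hypothetical compute-bound task:

```python
import queue
import threading

job_queue: "queue.Queue[tuple[int, str]]" = queue.Queue()
results: dict[int, str] = {}

def heavy_inference(prompt: str) -> str:
    # Stand-in for a local model call that would pin a CPU or GPU.
    return prompt.upper()

def worker() -> None:
    # Worker loop: pull jobs, run the heavy compute, store results.
    # With Celery/Temporal this loop would live in another process,
    # often on another machine, fed by a broker instead of a Queue.
    while True:
        job_id, prompt = job_queue.get()
        if job_id < 0:  # sentinel value signals shutdown
            break
        results[job_id] = heavy_inference(prompt)
        job_queue.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# The "node" side: enqueue jobs and move on instead of blocking the graph.
for i, prompt in enumerate(["hello", "world"]):
    job_queue.put((i, prompt))

job_queue.join()          # wait until the worker has drained the queue
job_queue.put((-1, ""))   # tell the worker to exit
t.join()
```

In a real deployment the queue is a broker (Redis, RabbitMQ, or Temporal's server), the worker is a separate fleet you can scale independently, and the node awaits the result via a callback or polling rather than `join()`.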
