LangChain
Module 2 Lesson 4: Batching Requests
3 articles
Parallel Processing. How to use .batch() to send multiple independent queries to an LLM at once.
Handling the crowd. How to manage thousands of concurrent agents without crashing your database or hitting API rate limits.
Serving the crowd. How to configure Ollama to handle multiple concurrent user requests.
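
The core idea behind `.batch()` can be sketched with the standard library alone: fan independent requests out across a thread pool and collect the results in input order. This is a minimal conceptual sketch, not LangChain's actual implementation; `fake_llm` is a hypothetical stand-in for a real model call, which would do network I/O.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; a real client
    # would block on network I/O here, which is exactly why
    # running the calls concurrently pays off.
    return f"Answer to: {prompt}"

prompts = [
    "What is 2 + 2?",
    "Name the capital of France.",
    "Define batching.",
]

# Conceptually what batching does: the queries are independent,
# so they run concurrently on a thread pool, and the results
# come back in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_llm, prompts))

print(results)
```

With a real LangChain runnable, the equivalent call is `model.batch(prompts)`, and concurrency can be capped via the config argument (e.g. `model.batch(prompts, config={"max_concurrency": 5})`) to avoid hitting provider rate limits.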