Module 2 Lesson 4: Batching Requests
Parallel Processing. How to use .batch() to send multiple independent queries to an LLM at once.
Batching: Mass-Producing Answers
If you have 100 customer reviews and you want to summarize all of them, running a loop with .invoke() will take a long time because each request has to wait for the previous one to finish. Batching allows you to send all of them in parallel.
1. Sequential vs. Parallel
- Sequential: request 1, then request 2, then request 3, then request 4, each waiting for the previous one to finish (~10 seconds total).
- Batch (Parallel): [request 1, request 2, request 3, request 4] all in flight at once (~3 seconds total, roughly the time of the slowest single call).
LangChain's `.batch()` runs the calls concurrently on a thread pool under the hood, so the network waits overlap instead of stacking up.
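To see the difference yourself, you can time both approaches. This is a minimal sketch, assuming an OpenAI API key is configured and using a hypothetical `gpt-4o-mini` chat model (any chat model works here):

```python
import time

from langchain_openai import ChatOpenAI

# Assumed setup: any chat model Runnable works here
model = ChatOpenAI(model="gpt-4o-mini")

queries = [f"What is {i}+{i}?" for i in range(1, 5)]

# Sequential: each .invoke() waits for the previous response
start = time.perf_counter()
sequential = [model.invoke(q) for q in queries]
print(f"Sequential: {time.perf_counter() - start:.1f}s")

# Parallel: .batch() sends all four at once
start = time.perf_counter()
parallel = model.batch(queries)
print(f"Batch:      {time.perf_counter() - start:.1f}s")
```

Exact timings depend on the provider and model, but the batched call should finish in roughly the time of its slowest member.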
2. Using .batch()
The .batch() method takes a list of inputs and returns a list of outputs.
```python
queries = [
    "What is 1+1?",
    "What is 2+2?",
    "What is 3+3?",
]

# Send all three at once
responses = model.batch(queries)

for res in responses:
    print(res.content)
```
3. Rate Limit Warnings
While batching is fast, it's also the easiest way to get your API key blocked.
- If you send 50 batches of 10 requests each (500 requests total), OpenAI might hit you with a `429: Too Many Requests` error.
- Solution: use the `max_concurrency` parameter to throttle how many calls run at once.
```python
# Limit to only 3 parallel calls at a time to be safe
responses = model.batch(queries, config={"max_concurrency": 3})
```
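If occasional 429s still slip through, Runnables also expose a `.with_retry()` helper that re-issues failed calls; combining it with `max_concurrency` is a reasonable pattern. A minimal sketch, reusing `model` and `queries` from above:

```python
# Retry each failed call up to 3 attempts, while still capping
# how many requests are in flight at the same time
safe_model = model.with_retry(stop_after_attempt=3)
responses = safe_model.batch(queries, config={"max_concurrency": 3})
```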
4. Error Handling in Batches
If one request in a batch of 10 fails, the default behavior of `.batch()` is to raise that exception for the whole call, so you should wrap your batch call in a try/except block, ask for exceptions to be returned per item (see the sketch below), or use Checkpoints for very large batches (Module 7).
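For per-item error handling, `.batch()` accepts a `return_exceptions=True` flag that returns the exception objects in place of the failed results instead of aborting the whole batch. A minimal sketch, reusing `model` and `queries` from above:

```python
# Failed items come back as Exception objects instead of raising
results = model.batch(queries, return_exceptions=True)

for query, result in zip(queries, results):
    if isinstance(result, Exception):
        print(f"FAILED: {query!r} -> {result}")
    else:
        print(result.content)
```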
5. Visualizing the Throughput
```mermaid
graph TD
    Data[100 Text Chunks] --> Split[Split into 10 Batches]
    Split --> Batch1[Async Call]
    Split --> Batch2[Async Call]
    Split --> Batch3[Async Call]
    Batch1 --> Consolidation[Final Result List]
    Batch2 --> Consolidation
    Batch3 --> Consolidation
```
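In code, the picture above amounts to chunking the inputs and consolidating the returned lists. A minimal sketch of such a pipeline, assuming the `model` from earlier and placeholder review texts (the batch size of 10 is arbitrary):

```python
chunks = [f"Review text {i}" for i in range(100)]  # placeholder data

summaries = []
for i in range(0, len(chunks), 10):
    batch = [f"Summarize this review: {text}" for text in chunks[i : i + 10]]
    # Each group of 10 runs in parallel; groups run one after another
    responses = model.batch(batch, config={"max_concurrency": 10})
    summaries.extend(res.content for res in responses)

print(len(summaries))  # 100 summaries, in the same order as the inputs
```

In practice you can also pass all 100 inputs to a single `.batch()` call and let `max_concurrency` do the splitting for you; explicit chunking is mainly useful when you want to checkpoint progress between groups.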
Key Takeaways
- `.batch()` enables parallel processing of multiple inputs.
- It significantly reduces total wall-clock time for large datasets.
- Set `max_concurrency` to avoid being rate-limited or blocked by API providers.
- Batching is ideal for data pipelines and offline analysis.