
Load Testing and Stress Testing
Know your limits. Learn how to use Locust to simulate thousands of users and find the breaking point of your FastAPI application.
You've built your API, it's fast on your laptop, and your tests pass. But what happens when 10,000 users hit your site at the exact same moment during a product launch?
In this lesson, we learn how to "Attack" our own API to find its breaking point.
1. What is Load Testing?
Load testing simulates a specific, expected amount of traffic to see how the system behaves. Stress testing pushes past that level until something breaks, so you learn where the limit is.
- Throughput: How many requests per second (RPS) can we handle?
- Latency: How slow does it get when traffic is high?
- Stability: Does the database crash under pressure?
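To make the first two metrics concrete, here is a minimal sketch of how throughput and latency percentiles could be computed from raw response times. The function name `summarize` and the sample numbers are illustrative assumptions, not output from a real test; Locust reports similar figures for you.

```python
import math


def summarize(latencies_ms, duration_s):
    """Return throughput (RPS) and latency percentiles for one test run."""
    if not latencies_ms or duration_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    ordered = sorted(latencies_ms)

    def percentile(p):
        # Nearest-rank method: smallest sample with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "rps": len(ordered) / duration_s,
        "p50_ms": percentile(50),
        "p95_ms": percentile(95),
        "max_ms": ordered[-1],
    }


# Hypothetical run: 10 requests completed in 2 seconds, one of them very slow.
sample = [12, 15, 11, 14, 13, 250, 12, 16, 13, 14]
stats = summarize(sample, duration_s=2)
print(stats)
```

In this made-up sample the median stays low while the 95th percentile is dominated by the single slow request, which is why tail percentiles are usually worth watching alongside the average.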
2. Using Locust
Locust is a popular choice for FastAPI developers because test scenarios are written in plain Python. You write a user class that describes what a typical user does, and Locust spawns as many simulated users as you ask for, into the thousands.
Example Locust Script (locustfile.py):
from locust import HttpUser, task, between


class WebsiteUser(HttpUser):
    # Each simulated user waits 1-5 seconds between tasks
    wait_time = between(1, 5)

    @task
    def view_items(self):
        # Read path: list items
        self.client.get("/items/")

    @task
    def create_item(self):
        # Write path: create one item
        self.client.post("/items/", json={"name": "test", "price": 10})
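To find a breaking point rather than test a single traffic level, one option is to raise the user count in steps and watch where latency or errors jump. The sketch below uses Locust's `LoadTestShape` class, whose `tick()` method returns a `(user_count, spawn_rate)` tuple or `None` to stop the test. The stage durations and user counts are assumptions chosen for illustration, and the names `STAGES`, `stage_for`, and `StepLoadShape` are hypothetical.

```python
try:
    from locust import LoadTestShape
except ImportError:
    # Fallback so the ramp logic below can be read and tested without Locust installed.
    LoadTestShape = object

# Hypothetical stages: (end time in seconds, target users, users spawned per second).
STAGES = [
    (60, 100, 10),
    (120, 1_000, 100),
    (180, 5_000, 500),
    (240, 10_000, 1_000),
]


def stage_for(run_time):
    """Return (users, spawn_rate) for the elapsed time, or None once all stages are done."""
    for end_time, users, spawn_rate in STAGES:
        if run_time < end_time:
            return users, spawn_rate
    return None


class StepLoadShape(LoadTestShape):
    """Steps the user count up each minute; Locust calls tick() roughly once per second."""

    def tick(self):
        # get_run_time() is provided by LoadTestShape and returns elapsed seconds.
        return stage_for(self.get_run_time())
```

Placing a shape class like this in the same locustfile as the user class lets Locust drive the ramp itself, so each step can be compared against the one before it.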
3. Finding the Bottleneck
When your API slows down, it's usually one of three things:
- Database: The most common culprit. You lack an index, or you ran out of connections.
- CPU: You are doing too much heavy processing inside request handlers, which in async endpoints blocks the event loop for every other request.
- Network: You are sending massive JSON objects that clog the pipe.
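One way to tell these three apart is to time each phase of a request separately. The following is a minimal sketch using only the standard library; the `timed` helper, the phase names, and the simulated work (a `sleep` standing in for a slow query) are all assumptions for illustration, not part of FastAPI or Locust.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # phase name -> list of durations in seconds


@contextmanager
def timed(phase):
    """Record how long the wrapped block takes under the given phase name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase].append(time.perf_counter() - start)


def handle_request():
    with timed("database"):
        time.sleep(0.02)  # stand-in for a slow query
    with timed("cpu"):
        sum(i * i for i in range(10_000))  # stand-in for in-process computation
    with timed("serialize"):
        str(list(range(1_000)))  # stand-in for building the response body


for _ in range(5):
    handle_request()

# The phase with the largest total time is the first place to look.
slowest = max(timings, key=lambda phase: sum(timings[phase]))
print(slowest)
```

In a real service the same idea is usually covered by tracing or APM tooling, but even a crude breakdown like this replaces guessing with a measurement.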
4. Scaling: Vertical vs. Horizontal
- Vertical: Buy a bigger server (More CPU/RAM). Simple, but expensive and has a limit.
- Horizontal: Add more small servers. This is how large platforms such as Google and Amazon scale. FastAPI suits this well as long as your app is stateless (no sessions or caches held in process memory): you can run 10 copies of your app behind a Load Balancer (like Nginx or AWS ALB).
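A load test result feeds directly into a horizontal scaling estimate. The sketch below is back-of-the-envelope arithmetic, assuming throughput scales linearly with the number of copies and that a shared component such as the database is not the limit. The function name `replicas_needed` and the 70% headroom default are hypothetical choices, not a standard.

```python
import math


def replicas_needed(target_rps, per_instance_rps, headroom=0.7):
    """Estimate how many app copies are needed to serve target_rps.

    headroom is the fraction of measured capacity each copy should run at,
    leaving a margin for traffic spikes.
    """
    if per_instance_rps <= 0 or not 0 < headroom <= 1:
        raise ValueError("per_instance_rps must be positive and headroom in (0, 1]")
    return math.ceil(target_rps / (per_instance_rps * headroom))


# Hypothetical numbers: one copy handled 500 RPS in the load test, the launch needs 2,000 RPS.
print(replicas_needed(2_000, 500))
```

If every copy talks to the same database, adding copies past the database's capacity will not help, so the estimate is only as good as the linear-scaling assumption.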
Visualizing the Breaking Point
graph LR
    A["1 User"] --> B["10ms Latency"]
    C["100 Users"] --> D["15ms Latency"]
    E["1,000 Users"] --> F["50ms Latency"]
    G["10,000 Users"] --> H["ERROR: Database Timeout"]
    style H fill:#f66,stroke:#333
Summary
- Locust: Use it to simulate "Real" user behavior.
- RPS: Track your Requests Per Second to understand your capacity.
- Bottlenecks: Don't guess; use metrics to find out what is slow.
- Scaling: Use horizontal scaling to handle growing traffic, and remember that shared components such as the database still set the ceiling.
In the next lesson, we wrap up Module 15 with Exercises on performance and scale.
Exercise: The Stress Test
You run a load test and find that your API handles 500 RPS perfectly, but at 600 RPS, your database CPU hits 100% and crashes.
- Is this a "Code" problem or a "Database" problem?
- What are two things you could do to fix it? (Hint: Think about Caching and Database Indexes).