Uvicorn and Gunicorn: The ASGI Connection

When you run fastapi dev main.py, you start your app in development mode with auto-reload enabled. That is convenient while you are coding, but it is not configured for real-world traffic. For production, you need to run the app under an ASGI server that is set up for that job.

In this lesson, we learn to configure the "Engines" that power FastAPI in the cloud.

1. What is ASGI?

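Before async Python became common, web apps used WSGI (Web Server Gateway Interface). It was designed for synchronous frameworks such as Flask and traditional Django, where a request occupies its worker until the response is finished.

FastAPI uses ASGI (Asynchronous Server Gateway Interface). ASGI defines the application as an async callable, which lets a single process keep many WebSocket connections and long-running requests open at once instead of blocking on each one.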
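To make the interface concrete, here is a minimal sketch of a bare ASGI application with no framework involved. The app is a single coroutine that takes scope, receive, and send. The call_once helper is a hypothetical stand-in for a real server such as Uvicorn, added only so the example can run on its own.

```python
import asyncio


async def app(scope, receive, send):
    """A bare ASGI application: one coroutine taking scope, receive, send."""
    if scope["type"] != "http":
        return  # this sketch only handles plain HTTP requests
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI"})


async def call_once(asgi_app):
    """Hypothetical stand-in for a server: call the app once, collect its messages."""
    sent = []

    async def receive():
        # A single, empty request body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": []}
    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call_once(app))
print(messages[0]["status"], messages[1]["body"])
```

FastAPI builds this callable for you. The server's job is to translate network traffic into these scope, receive, and send calls.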

2. Uvicorn: The Speedy Worker

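Uvicorn is the most popular ASGI server. It is fast partly because it can use uvloop, a high-performance event loop written in Cython on top of libuv (a C library). uvloop is optional: it is installed with the uvicorn[standard] extra, and without it Uvicorn falls back to the standard asyncio event loop.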

The Production Command:

uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

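--host 0.0.0.0: Binds to all network interfaces, so the server accepts connections from other machines and not only from localhost (the default is 127.0.0.1).
--port 8000: Listens on port 8000.
--workers 4: Spawns 4 separate worker processes, so requests can be handled in parallel across CPU cores.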
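If you prefer to keep these flags in one place, a small launcher script can assemble the command for you. The sketch below is illustrative only: uvicorn_command is a hypothetical helper name, and it just builds the argument list for the CLI shown above without starting a server.

```python
import shlex


def uvicorn_command(app: str, host: str = "0.0.0.0", port: int = 8000,
                    workers: int = 4) -> list[str]:
    """Build the argument list for the Uvicorn CLI (hypothetical helper)."""
    return [
        "uvicorn", app,
        "--host", host,
        "--port", str(port),
        "--workers", str(workers),
    ]


cmd = uvicorn_command("main:app")
print(shlex.join(cmd))
```

You could hand cmd to subprocess.run in a deployment script, which keeps the host, port, and worker count in a single place.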

3. Gunicorn: The Process Manager

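While Uvicorn is great at handling requests, Gunicorn is great at managing processes. Its master process starts the workers, monitors them, and automatically replaces any worker that crashes or stops responding.

A widely used setup for high-traffic apps is to run Gunicorn as the manager with Uvicorn workers handling the requests. Recent Uvicorn releases deprecate the built-in uvicorn.workers module in favor of the separate uvicorn-worker package, so check the Uvicorn documentation for the worker class that matches your version.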

gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
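The same options can live in a Gunicorn configuration file, which is itself plain Python. This is a minimal sketch, not a tuned production config. It uses Gunicorn's bind, workers, worker_class, and timeout settings; the values (including binding to all interfaces on port 8000 and sizing workers from the CPU count) are assumptions chosen to match this lesson.

```python
# gunicorn.conf.py
import multiprocessing

# Address and port to listen on (assumed to match the Uvicorn example).
bind = "0.0.0.0:8000"

# One worker per CPU core, following the rule of thumb in this lesson.
workers = multiprocessing.cpu_count()

# Use Uvicorn's worker class so each worker speaks ASGI.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds a worker may stay silent before Gunicorn restarts it
# (30 is Gunicorn's default).
timeout = 30
```

With this file in place, you would start the server with gunicorn -c gunicorn.conf.py main:app.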

4. The "One Process Per Core" Rule

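As a starting point, run one worker process per CPU core. Each worker runs its own event loop and can keep one core busy, so extra workers mostly add memory use and scheduling overhead.

If your cloud server has 4 CPUs, run 4 workers.
If you have only 1 CPU, run 2. This is the one exception to the rule, so that a second worker is available while the first is busy or restarting.

Treat these numbers as a baseline and adjust them after load testing your own app.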
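The rule above can be sketched as a small function. recommended_workers is a hypothetical helper name, and the floor of 2 encodes this lesson's single-CPU exception rather than any official formula.

```python
import os


def recommended_workers(cpu_count: int) -> int:
    """One worker per core, with a floor of 2 (this lesson's rule of thumb)."""
    return max(2, cpu_count)


# os.cpu_count() can return None, so fall back to 1 core in that case.
cores = os.cpu_count() or 1
print(f"{cores} core(s) detected, start with {recommended_workers(cores)} workers")
```

Inside a container, the detected core count may differ from the CPU limit you were assigned, so it is worth checking the number this prints.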

Visualizing the Production Stack

graph TD
    A["User Request (HTTPS)"] --> B["Load Balancer (Nginx/AWS)"]
    B --> C["Gunicorn (Master Process)"]
    C --> D["Uvicorn Worker 1"]
    C --> E["Uvicorn Worker 2"]
    C --> F["Uvicorn Worker 3"]
    
    D & E & F --> G["FastAPI Code"]

Summary

ASGI: The foundation of real-time Python web apps.
Uvicorn: The high-performance worker.
Gunicorn: The robust manager that handles crashes and restarts.
Optimization: Match your worker count to your CPU cores.

In the next lesson, we’ll look at Docker and Containerization, the standard way to package your app for the cloud.

Exercise: The Worker Calculation

You are deploying your API to an AWS Instance with 8 vCPUs.

How many Gunicorn workers should you ideally run?
What happens if you run 50 workers on a 1 CPU machine? (Hint: Think about "Context Switching").