
Metrics and Dashboards: Prometheus and Grafana
Turn numbers into insights. Learn how to expose API metrics, track throughput and error rates, and build beautiful monitoring dashboards.
If Logs tell you "What" happened, Metrics tell you "How much" is happening. You don't read metrics; you watch them on a Dashboard.
In this lesson, we learn to track the "Vital Signs" of our FastAPI application using the industry-standard duo: Prometheus and Grafana.
1. The RED Method
When monitoring an API, you should always track three things (The RED Method):
- R (Rate): Number of requests per second.
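- E (Errors): Number of failed requests (4xx and 5xx). Many teams alert mainly on 5xx, since a 4xx usually points to a client mistake rather than a server fault.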
- D (Duration): How long each request takes (Latency).
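To make the three signals concrete, here is a minimal sketch that derives them from a handful of request records. The RequestRecord shape and the red_summary helper are hypothetical names invented for this illustration; in practice the instrumentation library collects these numbers for you.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One handled request (hypothetical shape, for illustration only)."""
    status: int        # HTTP status code returned to the client
    duration_s: float  # time spent handling the request, in seconds


def red_summary(records, window_s):
    """Summarize Rate, Errors, and Duration for the requests seen in one window."""
    total = len(records)
    # Count both 4xx and 5xx responses as failures, matching the list above.
    errors = sum(1 for r in records if r.status >= 400)
    return {
        "rate_per_s": total / window_s,
        "error_ratio": errors / total if total else 0.0,
        "max_duration_s": max((r.duration_s for r in records), default=0.0),
    }


records = [
    RequestRecord(200, 0.050),
    RequestRecord(200, 0.040),
    RequestRecord(500, 0.300),
    RequestRecord(404, 0.020),
]
summary = red_summary(records, window_s=2.0)
print(summary)
```

Four requests in a two-second window give a rate of 2 requests per second, and two of the four count as errors.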
2. Exposing Metrics in FastAPI
We use the prometheus-fastapi-instrumentator library. It automatically hooks into your app and adds a /metrics endpoint that a Prometheus server can "Scrape."
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator
app = FastAPI()
# Instrument every route and expose the results at /metrics
Instrumentator().instrument(app).expose(app)
Now, if you visit localhost:8000/metrics, you will see a list of counters and timers that look like this:
http_request_duration_seconds_count{handler="/users",method="GET"} 124
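Each line on that page follows the same pattern: a metric name, an optional set of labels in braces, and a numeric value. The sketch below parses one such line with the standard library only. The parse_sample helper is a hypothetical name, the label names in the sample are placeholders, and the regular expressions cover only the simple case (no escaped quotes, no timestamps); it is not how Prometheus itself reads the format.

```python
import re

# name, optional {labels}, whitespace, value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Split one metrics line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metrics sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


line = 'http_request_duration_seconds_count{handler="/users",method="GET"} 124'
name, labels, value = parse_sample(line)
print(name, labels, value)
```

Labels are what let you slice one metric by route or HTTP method later, when you build dashboard queries.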
3. Prometheus: The Database for Numbers
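Prometheus is a time-series database: it stores numeric samples with timestamps, not text. At a configured interval (15 seconds is a common choice), it scrapes your API's /metrics endpoint and records the current value of every metric. Counters only ever go up, so the question "How many requests per second are you handling?" is answered at query time by comparing successive samples.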
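Counters exposed on /metrics are cumulative totals, so a per-second rate has to be derived by comparing scrapes. The sketch below shows the basic idea with plain Python. The per_second_rate helper and the sample data are hypothetical, and this is a simplification of what a PromQL rate() query does (it skips the extrapolation Prometheus applies at window edges).

```python
def per_second_rate(samples):
    """Average per-second increase of a counter.

    samples: list of (timestamp_s, counter_value) pairs, oldest first.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the process restarted and the counter began again at zero,
        # so the new value itself is the increase since the restart.
        increase += curr if curr < prev else curr - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


# Four scrapes, 15 seconds apart, of a request counter.
scrapes = [(0, 100), (15, 130), (30, 175), (45, 205)]
rate = per_second_rate(scrapes)
print(round(rate, 2))  # 105 requests over 45 seconds -> 2.33
```

Storing the running total rather than the per-interval count means a missed scrape loses resolution but not data: the next sample still carries everything that happened in between.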
4. Grafana: The Beautiful Visualization
Grafana connects to Prometheus and turns those raw numbers into graphs, gauges, and alerts.
- Alerting: You can configure Grafana to send you a Slack message if your API's error rate goes above 5% for more than 2 minutes.
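The "for more than 2 minutes" part matters: it keeps a single bad scrape from paging anyone. Here is a minimal sketch of that condition in plain Python, assuming error ratios sampled at regular intervals. The alert_firing helper and its parameters are hypothetical; Grafana evaluates its alert rules with its own engine, and this only illustrates the logic.

```python
def alert_firing(error_ratios, threshold=0.05, hold_s=120):
    """Return True if the error ratio has stayed above the threshold
    for more than hold_s seconds, up to the most recent sample.

    error_ratios: list of (timestamp_s, ratio) pairs, oldest first.
    """
    breach_start = None
    for ts, ratio in error_ratios:
        if ratio > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of the current breach
        else:
            breach_start = None    # recovered, so the clock resets
    if breach_start is None:
        return False
    return error_ratios[-1][0] - breach_start > hold_s


sustained = [(0, 0.01), (30, 0.08), (60, 0.09), (90, 0.07),
             (120, 0.06), (150, 0.10), (180, 0.08)]
brief_spike = [(0, 0.01), (30, 0.08), (60, 0.02), (90, 0.07)]

print(alert_firing(sustained))    # True: above 5% from t=30 to t=180
print(alert_firing(brief_spike))  # False: the breach reset at t=60
```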
Visualizing the Monitoring Pipeline
graph LR
A["FastAPI App"] -- "/metrics" --> B["Prometheus (Storage)"]
B -- "Queries" --> C["Grafana (Dashboard)"]
C -- "Alert!" --> D["PagerDuty / Slack"]
Summary
- Metrics: Quantitative data about your system.
- RED Method: Rate, Errors, Duration.
- Prometheus: The scraper and storage engine.
- Grafana: The visualization and alerting layer.
In the next lesson, we'll look at Distributed Tracing with OpenTelemetry, which lets you follow a single request across multiple microservices.
Exercise: The Latency Alarm
Your API normally responds in 50ms. Suddenly, it starts responding in 2000ms.
- Which metric in the RED method would show this change?
- Why is an "Average" latency often a bad metric to watch? (Hint: Research the concept of Tail Latency or p99).
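If you want to experiment while working on the second question, the sketch below compares the average with the 50th and 99th percentiles on made-up data: 98 requests at 50ms and 2 at 2000ms. The percentile helper is a hypothetical nearest-rank implementation, one of several common definitions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up latencies in milliseconds: mostly fast, with two very slow requests.
latencies_ms = [50] * 98 + [2000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(mean_ms)  # 89.0
print(p50_ms)   # 50
print(p99_ms)   # 2000
```

Try changing the number of slow requests and watch which of the three numbers reacts first.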