Module 6 Lesson 3: Healthchecks and Restart Policies
Build a self-healing stack. Learn how to configure advanced healthchecks and restart policies directly in your Compose file to automate recovery from crashes.
In Module 4, we learned about docker run flags. Now, we will see how to define those same resilient behaviors in our "Infrastructure as Code" (the Compose file).
1. Restart Policies in Compose
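There are 4 main options:
- restart: "no": The default. The container is never restarted automatically.
- restart: always: Always restart the container when it stops. If you stop it manually, it stays stopped until the Docker daemon restarts or you start it again yourself.
- restart: on-failure: Only restart if the exit code is non-zero (a crash).
- restart: unless-stopped: Like always, except that a container that was explicitly stopped stays stopped even after the Docker daemon restarts.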
services:
  db:
    image: postgres
    restart: unless-stopped
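As a rough sketch of the on-failure option, the Compose specification also accepts an optional retry limit in the form on-failure:N. The service name worker and the image my-worker below are hypothetical placeholders, not part of this lesson's stack.

```yaml
# Sketch only: "worker" and "my-worker" are hypothetical names.
services:
  worker:
    image: my-worker
    # Restart after a non-zero exit code, but give up after 5 attempts
    # so a permanently broken container does not restart forever.
    restart: "on-failure:5"
```

Capping the retries is a judgment call: it trades automatic recovery for a clear signal that something needs human attention.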
2. Defining Healthchecks
While you can define a healthcheck in a Dockerfile, defining it in Compose is more flexible because you can change it for different environments (Dev vs Prod).
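services:
  api:
    image: my-api
    healthcheck:
      # curl must be installed inside the image for this check to work
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 10s      # time between checks
      timeout: 5s        # a check that runs longer than this counts as failed
      retries: 3         # consecutive failures before the status becomes unhealthy
      start_period: 20s  # grace period during which failures are not counted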
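The Dev vs Prod flexibility mentioned above might look roughly like the following override file. The file name docker-compose.prod.yml and the specific values are assumptions for illustration. Compose merges the files you pass it, so the test command still comes from the base file and only the listed keys change.

```yaml
# docker-compose.prod.yml (hypothetical file name): production-only overrides.
services:
  api:
    healthcheck:
      # Check less often in production, and tolerate more failures
      # before marking the container as unhealthy.
      interval: 30s
      retries: 5
```

You would apply it with docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d. During local development, a similar override could instead set disable: true under healthcheck to switch the check off entirely.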
3. The Power of start_period
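If your Java app takes 60 seconds to warm up and boot, a healthcheck that starts counting failures immediately will mark it as unhealthy before it has even finished starting. Anything that acts on health status (an orchestrator, for example) may then kill it. Failed checks during start_period do not count toward retries, while a single successful check ends the grace period early.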
- Always set a start_period long enough for your app's slowest startup time.
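A minimal sketch for the slow-starting Java case, assuming a hypothetical service java-api that serves a health endpoint on port 8080 and has curl available in its image. The 90s value is an illustrative margin on top of the 60-second boot time, not a recommendation.

```yaml
# Sketch only: service name, image, port, and path are hypothetical.
services:
  java-api:
    image: my-java-api
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      # Boot takes about 60s, so allow some headroom before
      # failed checks start counting toward retries.
      start_period: 90s
```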
4. Monitoring Health in the CLI
- docker-compose ps: Shows the health status next to the runtime status.
- docker inspect <id>: Gives the full history of the last 5 healthcheck outputs (great for debugging why a check is failing).
Exercise: The Self-Healing App
- Create a Compose file with an nginx service.
- Add a healthcheck that checks for a file that doesn't exist: test: ["CMD", "ls", "/tmp/ready"]
- Set a short interval (5s) and retries (3).
- Run docker-compose up -d and watch docker-compose ps.
- Wait for it to become unhealthy.
- Now, "fix" it manually: docker-compose exec nginx touch /tmp/ready.
- Wait 10 seconds. Did the status change back to healthy?
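If you get stuck, one possible starting point for the exercise is sketched below. It uses only the values given in the steps above and leaves the timeout and start_period at their defaults.

```yaml
# One possible Compose file for the exercise (a sketch, not the only answer).
services:
  nginx:
    image: nginx
    healthcheck:
      # ls exits non-zero while /tmp/ready is missing, so the check fails
      # until the file is created with "touch".
      test: ["CMD", "ls", "/tmp/ready"]
      interval: 5s
      retries: 3
```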
Summary
Healthchecks and restart policies are a large part of what makes an application "production grade." Restart policies bring crashed containers back automatically, and healthchecks expose containers that are still running but no longer working. Together they help ensure that small bugs or temporary network glitches don't turn into major outages.
Next Lesson: YAML Pro: Extension fields and YAML anchors.