Module 6 Lesson 3: Healthchecks and Restart Policies

Build a self-healing stack. Learn how to configure advanced healthchecks and restart policies directly in your Compose file to automate recovery from crashes.

In Module 4, we learned about docker run flags. Now, we will see how to define those same resilient behaviors in our "Infrastructure as Code" (the Compose file).

1. Restart Policies in Compose

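Compose accepts four values for restart:

  • restart: "no": The default. The container is never restarted automatically. The quotes matter, because YAML would otherwise read a bare no as the boolean false.
  • restart: always: Restart the container whenever it exits, whatever the exit code. A container you stop manually stays down until it is started again or the Docker daemon restarts.
  • restart: on-failure: Restart only if the container exits with a non-zero exit code (a crash). You can cap the number of attempts, for example on-failure:3.
  • restart: unless-stopped: Like always, except that a container explicitly stopped by a user stays stopped even after the Docker daemon restarts.

services: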
  db:
    image: postgres
    restart: unless-stopped
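
As a rough sketch of how two policies might sit side by side in one file: the worker service, the my-worker image, and the placeholder password below are hypothetical and only there for illustration.

```yaml
services:
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example   # placeholder value so the container can start
    # Long-running service: bring it back after crashes and daemon restarts,
    # but leave it down if someone stopped it on purpose.
    restart: unless-stopped

  worker:
    image: my-worker               # hypothetical image name
    # Restart only on a non-zero exit code, and give up after 3 attempts.
    restart: on-failure:3
```

Capping the retries on the worker keeps a job that can never succeed from restarting in an endless loop.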

2. Defining Healthchecks

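While you can define a healthcheck in a Dockerfile, defining it in Compose is more flexible because you can change it for different environments (Dev vs Prod). A healthcheck in the Compose file overrides any HEALTHCHECK baked into the image. The test command runs inside the container, so the tool it calls (curl here) must be present in the image.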

services:
  api:
    image: my-api
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 20s
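
One way to vary the check per environment is a second Compose file merged over the base one. The sketch below assumes the base file above is saved as compose.yaml; compose.prod.yaml is a hypothetical file name and the values are illustrative.

```yaml
# compose.prod.yaml (hypothetical file name), merged over the base file
services:
  api:
    healthcheck:
      # Compose merges mappings key by key, so only these two settings
      # change; test, timeout and start_period should keep the values
      # from the base file.
      interval: 30s
      retries: 5
```

To apply both files, pass them in order: docker compose -f compose.yaml -f compose.prod.yaml up -d. Later files take precedence over earlier ones.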

3. The Power of start_period

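If your Java app takes 60 seconds to warm up and boot, a healthcheck whose failures count from the first probe will mark the container "unhealthy" before it has even finished starting. Docker's restart policy does not react to health status on its own, but anything that does act on it (an orchestrator such as Docker Swarm, for example) may replace the container mid-boot. During start_period, failed probes do not count toward retries, and the first successful probe marks the container healthy right away.

  • Set start_period to cover your app's slowest expected startup time.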
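
For the 60-second Java case, a configuration might look like the sketch below. The service name, image name, port, and /health path are assumptions made for illustration, and the check assumes curl is installed in the image.

```yaml
services:
  java-app:
    image: my-java-app   # hypothetical image name
    healthcheck:
      # Assumes curl exists in the image and the app serves /health on port 8080.
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      # The app needs about 60s to boot; 90s leaves headroom for slow starts.
      start_period: 90s
```

Without the start_period, three failed probes at 10-second intervals would flag the container as unhealthy roughly 30 seconds in, well before the app is ready.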

4. Monitoring Health in the CLI

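  • docker-compose ps (docker compose ps in Compose V2): Shows the health status next to the runtime status, for example "Up 2 minutes (healthy)".
  • docker inspect <id>: The State.Health section holds the current status, the failing streak, and the output of the last 5 healthcheck runs (great for debugging why a check is failing). To print only that section, use docker inspect --format '{{json .State.Health}}' <id>.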

Exercise: The Self-Healing App

  1. Create a Compose file with an nginx service.
  2. Add a healthcheck that checks for a file that doesn't exist: test: ["CMD", "ls", "/tmp/ready"]
  3. Set a short interval (5s) and retries (3).
  4. Run docker-compose up -d and watch docker-compose ps.
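  5. Wait for it to become unhealthy (about 15 seconds: three failed probes, 5 seconds apart). Notice that the container keeps running; an unhealthy status alone does not trigger a restart.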
  6. Now, "Fix" it manually: docker-compose exec nginx touch /tmp/ready.
  7. Wait 10 seconds. Did the status change back to healthy?
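
A possible starting point for steps 1 to 3 is sketched below. It uses the stock nginx image and names the service nginx so that the exec command in step 6 matches.

```yaml
services:
  nginx:
    image: nginx
    healthcheck:
      # /tmp/ready does not exist yet, so ls exits non-zero and the probe fails.
      test: ["CMD", "ls", "/tmp/ready"]
      interval: 5s
      retries: 3
```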

Summary

Healthchecks and restart policies are what make your application "Production Grade." A restart policy brings a crashed container back automatically, and a healthcheck shows whether a running container can actually do its job. Together they help ensure that small bugs or temporary network glitches don't turn into major outages.

Next Lesson: YAML Pro: Extension fields and YAML anchors.
