Mastering Keepalived: The Magic of Floating IPs

In the previous lesson, we learned about the theory of Virtual IPs. Now, we will build one.

A popular, lightweight tool for building a failover cluster on Linux is Keepalived. It uses VRRP (Virtual Router Redundancy Protocol) to let two servers negotiate which of them owns a specific IP address.

If Server A is active and then crashes, Server B notices within a few seconds (roughly 3 to 4 with a 1-second advertisement interval) and broadcasts a gratuitous ARP to the network that says, in effect: "I am now the owner of 1.2.3.4. Send all traffic to me."

1. The VRRP Protocol Logic

Every Keepalived server has a Priority.

Server A: Priority 150 (The "Master").
Server B: Priority 100 (The "Backup").

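The Master sends a "Hello" packet (a VRRP advertisement) every second, by default to the multicast group 224.0.0.18; the Backup only listens. As long as Server B keeps hearing from a server with a higher priority (150), it stays quiet. If the advertisements stop for about three intervals, Server B promotes itself and announces that it now owns the IP.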
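
The takeover delay can be estimated from the VRRPv2 timer rule in RFC 3768, where a backup waits three advertisement intervals plus a priority-based skew. The sketch below assumes Keepalived is running its default VRRP version 2; the function name is made up for illustration.

```python
def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before taking over (VRRPv2)."""
    if not 1 <= priority <= 254:
        raise ValueError("configurable priorities range from 1 to 254")
    # A higher priority gives a shorter skew, so the best backup times out first.
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


if __name__ == "__main__":
    # Server B from the example: priority 100, advertisements every second
    print(f"{master_down_interval(100):.2f}")  # prints 3.61
```

With the values from this lesson, Server B waits about 3.6 seconds before it declares the Master dead.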

2. Practical: Configuring the Cluster

The configuration lives in /etc/keepalived/keepalived.conf.

On the MASTER Server (192.168.1.1):

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        # Keepalived uses only the first 8 characters of auth_pass
        auth_pass secret123
    }
    virtual_ipaddress {
        192.168.1.100
    }
}

On the BACKUP Server (192.168.1.2):

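vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    # Everything below is identical to the MASTER
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret123
    }
    virtual_ipaddress {
        192.168.1.100
    }
}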

3. The Health Check Script

What if the server is still "Up" (pingable) but the web server software (Nginx) has crashed? In this case, Server A is still sending "Hello" packets, but it is useless to the users.

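You can tell Keepalived to run a script that checks whether the app is healthy. If the script fails (exits non-zero), Keepalived adds the script's negative weight to its own priority. A failover happens only when the Master's effective priority drops below Server B's, so the penalty must be larger than the 50-point gap between the two servers.

vrrp_script check_nginx {
    # Exits 0 while at least one nginx process exists
    script "killall -0 nginx"
    # Run the check every 2 seconds
    interval 2
    # 150 - 60 = 90, which is below the Backup's 100
    weight -60
}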

# Add to vrrp_instance:
track_script {
    check_nginx
}
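
The priority arithmetic behind a negative weight can be sketched in a few lines. This is a simplified model, not Keepalived's code: the function names are hypothetical, and the clamp to the 1-254 range is an assumption about how the effective priority is bounded. It relies on the VRRP rule that a backup preempts only a master with a strictly lower priority.

```python
def effective_priority(base: int, weight: int, script_ok: bool) -> int:
    """Priority a node advertises, given one track script with a negative weight."""
    if weight >= 0:
        raise ValueError("this sketch only models negative weights")
    adjusted = base if script_ok else base + weight
    return max(1, min(254, adjusted))  # assumed bounds of the effective priority


def backup_takes_over(master_prio: int, backup_prio: int) -> bool:
    """A backup preempts only a master with a strictly lower priority."""
    return backup_prio > master_prio


if __name__ == "__main__":
    for weight in (-50, -60):
        failed = effective_priority(150, weight, script_ok=False)
        print(weight, failed, backup_takes_over(failed, 100))
```

With priorities 150 and 100, a weight of -50 only produces a tie at 100, and the Master keeps the IP. A weight of -60 drops the Master to 90, and Server B takes over.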

4. Practical: Testing the Failover

How do you know it works?

Ping the Virtual IP (192.168.1.100) from your laptop.
Log into Server A and pull the network cable (or run sudo systemctl stop keepalived).
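You should see a few "Request timed out" pings (typically 3 or 4 after a cable pull, usually fewer after a clean stop because Keepalived announces that it is leaving), and then the pings will resume.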
Run ip addr show eth0 on Server B. You will see the .100 address has appeared there!
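
If you would rather measure the gap than count timeouts by eye, a small probe loop can do it. This is a minimal sketch that assumes the Linux ping command (iputils) is installed; the function names and the 30-probe default are arbitrary choices for illustration.

```python
from __future__ import annotations

import subprocess
import time


def ping_once(host: str) -> bool:
    """Send one ICMP echo with the Linux `ping` command; True if a reply arrived."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],  # one packet, 1-second timeout
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def longest_outage(samples: list[bool]) -> int:
    """Return the longest run of consecutive failed probes."""
    longest = current = 0
    for ok in samples:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest


def watch(host: str, probes: int = 30, interval: float = 1.0) -> int:
    """Probe the host repeatedly and report the longest outage, in probes."""
    samples = []
    for _ in range(probes):
        samples.append(ping_once(host))
        time.sleep(interval)
    return longest_outage(samples)
```

Start watch("192.168.1.100") on your laptop, trigger the failover on Server A while it runs, and the return value tells you how many consecutive probes were lost.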

5. Preemption Mode

By default, Keepalived uses Preemption. This means if Server A (the Master) comes back online after a crash, it will immediately "Take Back" the IP from Server B.

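If your service is stateful (like a database), you might want to disable this. You don't want the IP jumping back to Server A while its database is still "waking up." Add nopreempt to the vrrp_instance block. Keepalived honors it only when the instance's initial state is BACKUP, so set state BACKUP on both servers and let the priorities pick the first Master.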
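
The preemption rule can be reduced to a small decision function. This is a simplified sketch, not Keepalived's actual code; the function name and the "holder" and "returning" labels are made up for illustration.

```python
def owner_after_recovery(holder_priority: int, returning_priority: int,
                         nopreempt: bool) -> str:
    """Which node holds the VIP once a recovered node rejoins the cluster."""
    if nopreempt:
        return "holder"       # the current owner keeps the VIP
    if returning_priority > holder_priority:
        return "returning"    # default preemption: higher priority reclaims it
    return "holder"


if __name__ == "__main__":
    # Server B (100) holds the VIP when Server A (150) comes back
    print(owner_after_recovery(100, 150, nopreempt=False))  # prints returning
    print(owner_after_recovery(100, 150, nopreempt=True))   # prints holder
```

With nopreempt in place, the IP moves back only if Server B itself fails, or if you move it deliberately.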

6. Example: An IP Ownership Logger (Python)

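If a failover happens while you are sleeping, you need to know about it. Here is a Python script that you can attach to a vrrp_instance with the notify directive. Keepalived runs it on every state change, and the script logs the time and the new state of the server. The file must be executable, and the user Keepalived runs it as needs write access to the log file.

#!/usr/bin/env python3
import sys
from datetime import datetime

LOG_FILE = "/var/log/ha_events.log"


def log_transition(state):
    """
    Logs the HA state change (MASTER/BACKUP/FAULT).
    """
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(LOG_FILE, "a") as f:
        f.write(f"[{timestamp}] ALERT: Keepalived transition to {state}\n")


if __name__ == "__main__":
    # A "notify" script is called with four arguments:
    #   INSTANCE or GROUP, the instance name, the new state, the priority
    # so the state is the third argument, not the first.
    if len(sys.argv) > 3:
        log_transition(sys.argv[3])
    else:
        sys.exit("usage: <script> INSTANCE|GROUP <name> <state> <priority>")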

7. Professional Tip: Use 'virtual_router_id' carefully

If you have multiple HA clusters on the same local network, they will conflict if they use the same virtual_router_id. Make sure every cluster on the segment has its own ID (between 1 and 255), and that both servers inside a cluster share it.
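
If you keep an inventory of your clusters, a quick check for reused IDs is easy to script. The sketch below assumes a hypothetical mapping of cluster name to virtual_router_id for a single network segment; the names and values are invented for the example.

```python
from __future__ import annotations

from collections import defaultdict


def find_vrid_conflicts(clusters: dict[str, int]) -> dict[int, list[str]]:
    """Map each reused virtual_router_id to the cluster names sharing it."""
    by_vrid = defaultdict(list)
    for name, vrid in clusters.items():
        if not 1 <= vrid <= 255:
            raise ValueError(f"{name}: virtual_router_id {vrid} is outside 1-255")
        by_vrid[vrid].append(name)
    return {vrid: names for vrid, names in by_vrid.items() if len(names) > 1}


if __name__ == "__main__":
    inventory = {"web": 51, "db": 52, "cache": 51}  # hypothetical clusters
    print(find_vrid_conflicts(inventory))  # prints {51: ['web', 'cache']}
```

An empty result means every cluster on the segment has its own ID.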

8. Summary

Keepalived is the engine of the floating IP.

VRRP is the logic that decides who is the leader.
Priority determines the hierarchy.
Track Scripts ensure the VIP follows the health of the actual application.
Notify scripts allow you to send alerts when the system changes state.
/var/log/ha_events.log is your history of stability.

In the next lesson, we move from failover to load distribution: The Traffic Cop—Mastering HAProxy.

Quiz Questions

What happens if both servers are configured as "MASTER" with the same priority?
How does the weight parameter in a track script affect the cluster?
Why is it important to use authentication (auth_pass) in your Keepalived config?

