
The Floating IP: Mastering Keepalived
Build your first failover cluster. Master 'Keepalived' and the VRRP protocol. Learn to share a single IP address between two servers. Understand priority settings, health checks, and how to trigger scripts when a failover happens.
Mastering Keepalived: The Magic of Floating IPs
In the previous lesson, we learned about the theory of Virtual IPs. Now, we will build one.
One of the most popular lightweight tools for creating a failover cluster on Linux is Keepalived. It uses VRRP (Virtual Router Redundancy Protocol) to let two servers negotiate which one owns a specific IP address.
If Server A is active and then crashes, Server B notices within a few seconds (after about three missed advertisements) and broadcasts a gratuitous ARP to the network, which effectively says: "I am now the owner of 1.2.3.4. Send all traffic to me."
1. The VRRP Protocol Logic
Every Keepalived server has a Priority.
- Server A: Priority 150 (The "Master").
- Server B: Priority 100 (The "Backup").
The Master sends a "Hello" packet (a VRRP advertisement) every second. As long as Server B hears from a server with a higher priority (150), it stays quiet. If the "Hello" packets stop, Server B announces itself and takes over the IP.
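How long Server B waits before announcing itself can be sketched in a few lines. This is a minimal model, assuming the VRRPv2 timers from RFC 3768 (three advertisement intervals plus a priority-dependent skew); the function names are illustrative and not part of Keepalived.

```python
def master_down_interval(advert_int: float, priority: int) -> float:
    """Seconds a Backup waits without advertisements before taking over."""
    # Skew time: higher-priority Backups wait slightly less, so they win the race.
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


def backup_should_take_over(seconds_since_last_advert: float,
                            advert_int: float, priority: int) -> bool:
    """True once the silence has lasted longer than the master-down interval."""
    return seconds_since_last_advert > master_down_interval(advert_int, priority)
```

Under that formula, a Backup with priority 100 and a one-second advertisement interval waits roughly 3.6 seconds before claiming the IP.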
2. Practical: Configuring the Cluster
The configuration lives in /etc/keepalived/keepalived.conf.
On the MASTER Server (192.168.1.1):
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret123
    }
    virtual_ipaddress {
        192.168.1.100
    }
}
On the BACKUP Server (192.168.1.2):
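vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    # Everything below is identical to the MASTER
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret123
    }
    virtual_ipaddress {
        192.168.1.100
    }
}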
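Because the two files differ only in state and priority, one way to keep them in sync is to render both from a single template. The sketch below is an illustration under that assumption: the helper names are hypothetical, and the remaining values are copied from the lesson's example.

```python
TEMPLATE = """\
vrrp_instance VI_1 {{
    state {state}
    interface eth0
    virtual_router_id 51
    priority {priority}
    advert_int 1
    authentication {{
        auth_type PASS
        auth_pass {auth_pass}
    }}
    virtual_ipaddress {{
        192.168.1.100
    }}
}}
"""


def render_instance(state, priority, auth_pass="secret123"):
    """Fill in the per-node fields; everything else is shared by both nodes."""
    if state not in ("MASTER", "BACKUP"):
        raise ValueError(f"unknown state: {state}")
    if not 1 <= priority <= 255:
        raise ValueError("priority must be between 1 and 255")
    return TEMPLATE.format(state=state, priority=priority, auth_pass=auth_pass)


def differing_lines(a, b):
    """Return (left, right) pairs for every line that differs between two configs."""
    return [(x.strip(), y.strip())
            for x, y in zip(a.splitlines(), b.splitlines()) if x != y]
```

Comparing `render_instance("MASTER", 150)` with `render_instance("BACKUP", 100)` through `differing_lines` should report only the state and priority lines.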
3. The Health Check Script
What if the server is still "Up" (pingable) but the web server software (Nginx) has crashed? In this case, Server A is still sending "Hello" packets, but it is useless to the users.
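You can tell Keepalived to run a script that checks whether the application is healthy. If the script fails, Keepalived lowers its own priority by the script's weight. The drop has to take Server A below Server B's priority (100) so that Server B outranks it and takes over the IP.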
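vrrp_script check_nginx {
    script "killall -0 nginx"   # exit code 0 while an nginx process exists
    interval 2                  # run the check every 2 seconds
    weight -60                  # on failure: 150 - 60 = 90, below the Backup's 100
}
# Add inside the vrrp_instance block:
track_script {
    check_nginx
}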
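The arithmetic behind the weight is worth checking before you rely on it. The following sketch models how a single tracked script changes the effective priority, assuming the lesson's priorities (150 and 100) and a clamp to the 1-254 range; the function names are hypothetical, and the model is a simplification of what Keepalived does.

```python
def effective_priority(base, weight, script_ok):
    """Priority after applying one tracked script's weight.

    Negative weight: subtracted while the script is failing.
    Positive weight: added while the script is succeeding.
    """
    if weight < 0 and not script_ok:
        base += weight
    elif weight > 0 and script_ok:
        base += weight
    # Assumed clamp: keep the result inside the usable priority range.
    return max(1, min(254, base))


def backup_outranks(master_base, backup_priority, weight, script_ok):
    """True if the Backup's priority is strictly higher than the Master's."""
    return backup_priority > effective_priority(master_base, weight, script_ok)
```

With a weight of -50 a failing Master drops to exactly 100, which only ties with the Backup; a weight of -60 drops it to 90, so the Backup clearly outranks it.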
4. Practical: Testing the Failover
How do you know it works?
- Ping the Virtual IP (192.168.1.100) from your laptop.
- Log into Server A and pull the network cable (or run sudo systemctl stop keepalived).
- You should see a few "Request timed out" pings, and then the pings will resume.
- Run ip addr show eth0 on Server B. You will see the .100 address has appeared there!
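If you would rather measure the gap than count lost pings by eye, the test can be scripted. This is a rough sketch: `probe` assumes the Linux iputils `ping` flags (`-c` for count, `-W` for timeout in seconds), and both function names are illustrative.

```python
import subprocess


def probe(host):
    """Send one ICMP echo; True if a reply arrived within one second."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def longest_outage(samples):
    """samples: (timestamp_seconds, reachable) pairs in time order.

    Returns the longest span from a first failed probe to the next success.
    """
    longest = 0.0
    outage_start = None
    for timestamp, reachable in samples:
        if not reachable and outage_start is None:
            outage_start = timestamp          # outage begins
        elif reachable and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None               # outage ends
    return longest
```

To use it, append `(time.monotonic(), probe("192.168.1.100"))` to a list about once per second while you trigger the failover, then pass the list to `longest_outage`.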
5. Summary: Preemption Mode
By default, Keepalived uses Preemption. This means if Server A (the Master) comes back online after a crash, it will immediately "Take Back" the IP from Server B.
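If your data is sensitive (like a database), you might want to disable this. You don't want the IP jumping back and forth while the database is still "waking up." Use the nopreempt option inside the vrrp_instance block. Keepalived only honors it when the instance's initial state is BACKUP, so set state BACKUP on both servers and let the priorities decide who owns the IP first.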
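The difference between the two modes is easiest to see on a timeline. The sketch below is a toy model, not Keepalived's actual state machine: it tracks which node holds the IP as nodes go down and come back, with node names and priorities taken from this lesson.

```python
def vip_owner_timeline(events, priorities, preempt=True):
    """Return the VIP owner after each ("down" | "up", node) event."""
    alive = set(priorities)
    owner = max(alive, key=priorities.get)   # highest priority starts as owner
    timeline = []
    for action, node in events:
        if action == "down":
            alive.discard(node)
            if node == owner:
                # The owner died: the best remaining node takes the IP.
                owner = max(alive, key=priorities.get) if alive else None
        elif action == "up":
            alive.add(node)
            if owner is None:
                owner = node
            elif preempt and priorities[node] > priorities[owner]:
                owner = node                  # preemption: take the IP back
        timeline.append(owner)
    return timeline
```

For the events `[("down", "A"), ("up", "A")]` with priorities `{"A": 150, "B": 100}`, preemption hands the IP back to A, while `preempt=False` leaves it on B until B itself fails.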
6. Example: An IP Ownership Logger (Python)
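If a failover happens while you are sleeping, you need to know about it. Here is a Python script that you can use as a "Notify" script in Keepalived. It logs the exact time and the state the server moved to. Make the script executable and reference it from the vrrp_instance block.
import sys
from datetime import datetime

def log_transition(state):
    """
    Logs the HA state change (MASTER/BACKUP/FAULT).
    """
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open("/var/log/ha_events.log", "a") as f:
        f.write(f"[{timestamp}] ALERT: Keepalived transition to {state}\n")

if __name__ == "__main__":
    # The generic "notify" option appends four arguments to the script:
    # TYPE (GROUP or INSTANCE), NAME, STATE, PRIORITY. The state is the third.
    if len(sys.argv) >= 4:
        log_transition(sys.argv[3])
    # notify_master / notify_backup / notify_fault append nothing, so pass
    # the state yourself as the first argument on the config line.
    elif len(sys.argv) > 1:
        log_transition(sys.argv[1])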
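Once the log has a few entries, a short companion script can summarize it. This is a sketch that assumes the exact line format written by the logger above; the function names are illustrative.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as: [2024-01-01 03:00:05] ALERT: Keepalived transition to MASTER
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"ALERT: Keepalived transition to (?P<state>\S+)$"
)


def parse_events(lines):
    """Return (datetime, state) pairs, skipping lines that do not match."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, match.group("state")))
    return events


def count_states(events):
    """Count how many times each state was entered."""
    return Counter(state for _, state in events)
```

Passing an open file handle for /var/log/ha_events.log to `parse_events` works, since the function only iterates over lines.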
7. Professional Tip: Use 'virtual_router_id' carefully
If you have multiple HA clusters on the same local network, they will conflict if they use the same virtual_router_id. Always make sure every pair of servers has a unique ID (between 1 and 255).
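A quick check over your inventory can catch a duplicate ID before it causes trouble. The sketch below assumes you keep a simple mapping of cluster name to virtual_router_id; the cluster names in the usage note are made up for illustration.

```python
from collections import defaultdict


def find_vrid_conflicts(clusters):
    """clusters: dict mapping a cluster name to its virtual_router_id.

    Returns {vrid: [cluster names]} for every ID used by more than one cluster.
    """
    by_vrid = defaultdict(list)
    for name, vrid in clusters.items():
        if not 1 <= vrid <= 255:
            raise ValueError(f"{name}: virtual_router_id {vrid} is outside 1-255")
        by_vrid[vrid].append(name)
    return {vrid: sorted(names) for vrid, names in by_vrid.items() if len(names) > 1}
```

For example, `find_vrid_conflicts({"web": 51, "db": 52, "cache": 51})` reports that ID 51 is shared by the hypothetical "cache" and "web" clusters.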
8. Summary
Keepalived is the engine of the floating IP.
- VRRP is the logic that decides who is the leader.
- Priority determines the hierarchy.
- Track Scripts ensure the VIP follows the health of the actual application.
- Notify scripts allow you to send alerts when the system changes state.
- /var/log/ha_events.log is your history of stability.
In the next lesson, we move from failover to load distribution: The Traffic Cop—Mastering HAProxy.
Quiz Questions
- What happens if both servers are configured as "MASTER" with the same priority?
- How does the weight parameter in a track script affect the cluster?
- Why is it important to use authentication (auth_pass) in your Keepalived config?