
The Mirror World: Syncing with lsyncd and rsync
Keep your cluster in perfect harmony. Master 'rsync' for efficient file transfers and 'lsyncd' for real-time, event-based synchronization. Learn to mirror your web assets across 10 servers within seconds, without manual intervention.
Data Synchronization: Keeping the Cluster in Sync
If you have 5 web servers behind a Load Balancer, they must all serve exactly the same code. Otherwise a user could refresh the page and see Version 1, then refresh again and see Version 2!
In this lesson, we will learn how to "Mirror" our data across multiple servers. We will use the legendary rsync for manual pushes and the modern lsyncd (Live Syncing Daemon) for real-time, automated updates.
1. rsync: The Delta Transfer Master
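rsync is one of the most efficient file transfer tools available on Linux. Its "Superpower" is the Delta Transfer Algorithm: when the destination already holds an older copy of a file, rsync compares block checksums and sends only the blocks that changed. If you have a 1GB file and you change only 1KB of data, rsync sends roughly that changed region (plus a small amount of checksum data) across the network instead of the full 1GB.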
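To make the delta idea concrete, here is a deliberately simplified sketch in Python. It is not rsync's actual algorithm: real rsync uses a rolling checksum so it can also match blocks that have shifted position, while this toy version only compares fixed-size blocks at the same offset. The function names and the 1 KB block size are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 1024  # illustrative block size, not rsync's real default


def block_signatures(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Receiver side: hash every fixed-size block of the existing copy."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def compute_delta(old_signatures: list, new_data: bytes,
                  block_size: int = BLOCK_SIZE) -> list:
    """Sender side: ("keep", index) for unchanged blocks, ("data", bytes) otherwise."""
    delta = []
    for index, offset in enumerate(range(0, len(new_data), block_size)):
        block = new_data[offset:offset + block_size]
        unchanged = (
            index < len(old_signatures)
            and hashlib.sha256(block).hexdigest() == old_signatures[index]
        )
        delta.append(("keep", index) if unchanged else ("data", block))
    return delta


def apply_delta(old_data: bytes, delta: list,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Receiver side: rebuild the new file from the old copy plus the delta."""
    out = bytearray()
    for kind, value in delta:
        if kind == "keep":
            out += old_data[value * block_size:(value + 1) * block_size]
        else:
            out += value
    return bytes(out)


def bytes_sent(delta: list) -> int:
    """Count only the literal data that would cross the network."""
    return sum(len(value) for kind, value in delta if kind == "data")
```

In this sketch, changing a few bytes inside a large buffer produces a delta with a single literal block; every other entry is a "keep" reference to data the receiver already has.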
Essential Flags:
- -a (Archive): Recurses into directories and keeps permissions, owners, and timestamps identical.
- -v (Verbose): Shows you what it's doing.
- -z (Compress): Compresses data during the transfer.
- --delete: If a file is deleted on the Source, delete it on the Destination too. (Use with caution!)
# Sync local web folder to a remote server
rsync -avz /var/www/html/ user@remote-ip:/var/www/html/
Pro Tip: Always include a trailing slash on the source directory (/home/data/) if you want to sync its contents. If you leave it off (/home/data), rsync copies the directory itself, creating a data/ folder inside the destination.
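If you script your pushes, one way to avoid the trailing-slash mistake is to build the command in code. The following is a minimal sketch, not an official wrapper: `build_rsync_command` and `run_rsync` are hypothetical helper names, and the sketch defaults to rsync's `--dry-run` flag so a first run only reports what would change.

```python
import subprocess


def build_rsync_command(source_dir: str, target: str,
                        delete: bool = False, dry_run: bool = True) -> list:
    """Build an rsync argument list that always syncs the contents of source_dir."""
    # Normalise to exactly one trailing slash so rsync copies the contents,
    # not the directory itself.
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "-avz"]
    if delete:
        command.append("--delete")
    if dry_run:
        command.append("--dry-run")  # report what would change, transfer nothing
    command += [source, target]
    return command


def run_rsync(source_dir: str, target: str,
              delete: bool = False, dry_run: bool = True):
    """Run the command and return the CompletedProcess for inspection."""
    command = build_rsync_command(source_dir, target, delete, dry_run)
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

For example, `build_rsync_command("/var/www/html", "user@remote-ip:/var/www/html/")` yields the same command as the one shown above, with `--dry-run` added. Reviewing a dry run first is especially worthwhile before enabling `--delete`.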
2. lsyncd: Real-Time Automation
rsync is great, but it requires a human to run it (or a Cron job). lsyncd is a daemon that watches your files using the Kernel's inotify system.
When you save a file on the "Master" server, lsyncd receives the inotify event and, after a short configurable delay, rsyncs the changed files to all your "Worker" servers.
The Config Logic (/etc/lsyncd/lsyncd.conf.lua):
settings {
    logfile = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status"
}
sync {
    default.rsync,
    source = "/var/www/html/",
    target = "web-worker-01:/var/www/html/",
    rsync = {
        archive = true,
        compress = true
    }
}
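The config above covers a single worker. With several workers you need one sync block per target, and generating those blocks avoids copy-paste drift. Here is a small Python sketch that renders such a config as text; `render_sync_block` and `render_config` are hypothetical helpers, and any worker names beyond web-worker-01 are placeholders.

```python
def render_sync_block(source: str, worker: str, remote_path: str) -> str:
    """Render one lsyncd sync block using the same keys as the config above."""
    return (
        "sync {\n"
        "    default.rsync,\n"
        f'    source = "{source}",\n'
        f'    target = "{worker}:{remote_path}",\n'
        "    rsync = {\n"
        "        archive = true,\n"
        "        compress = true\n"
        "    }\n"
        "}\n"
    )


def render_config(source: str, workers: list, remote_path: str) -> str:
    """Render the settings block followed by one sync block per worker."""
    settings = (
        "settings {\n"
        '    logfile = "/var/log/lsyncd/lsyncd.log",\n'
        '    statusFile = "/var/log/lsyncd/lsyncd.status"\n'
        "}\n"
    )
    blocks = [render_sync_block(source, worker, remote_path) for worker in workers]
    return settings + "".join(blocks)


if __name__ == "__main__":
    # Hypothetical worker names; replace with your own hosts.
    print(render_config("/var/www/html/",
                        ["web-worker-01", "web-worker-02"],
                        "/var/www/html/"))
```

Review the generated text before writing it to /etc/lsyncd/lsyncd.conf.lua and restarting the daemon.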
3. SSH Keys: The Secret Sauce
For automated syncing to work, your Master server must be able to log into the workers without a password. You must set up SSH Key-Based Authentication (which we learned in Module 14).
# On Master: generate key
ssh-keygen
# On Master: send key to workers
ssh-copy-id user@web-worker-01
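Before relying on lsyncd, it helps to confirm that key-based login really works without a prompt. The sketch below runs a harmless remote command with OpenSSH's `BatchMode=yes` option, which makes ssh fail instead of asking for a password. The helper names are hypothetical, and the `runner` parameter exists only so the check can be exercised without a real network connection.

```python
import subprocess


def ssh_check_command(host: str, timeout_seconds: int = 5) -> list:
    """Build an ssh command that never prompts and runs the no-op `true` remotely."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                       # fail instead of prompting
        "-o", f"ConnectTimeout={timeout_seconds}",   # do not hang on dead hosts
        host,
        "true",
    ]


def can_login_without_password(host: str, runner=subprocess.run) -> bool:
    """Return True if the remote command succeeded using key-based auth only."""
    result = runner(ssh_check_command(host), capture_output=True)
    return result.returncode == 0
```

Calling `can_login_without_password("user@web-worker-01")` on the Master should return True once `ssh-copy-id` has succeeded. Run it as the same user that the lsyncd daemon runs as, since keys are per user.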
4. Practical: The "Big Sync" vs. "Small Update"
lsyncd is smart. If you create 1,000 files at once, it doesn't run rsync 1,000 times. It "Bundles" the changes and runs a single rsync pulse every few seconds (configurable with delay).
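The bundling behaviour can be pictured as a set of pending paths that is flushed once per delay window. The sketch below illustrates that idea only; it is not lsyncd's internal implementation, and `EventBatcher` is a hypothetical class. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class EventBatcher:
    """Collects changed paths and releases them as one batch per delay window."""

    def __init__(self, delay_seconds: float):
        self.delay_seconds = delay_seconds
        self.pending = set()       # a set, so repeated events for one file collapse
        self.window_start = None   # time of the first event in the current window

    def record(self, path: str, now: float) -> None:
        """Register a file-change event observed at time `now`."""
        if not self.pending:
            self.window_start = now  # first event opens a new window
        self.pending.add(path)

    def flush_if_due(self, now: float) -> list:
        """Return the batch if the delay has elapsed, otherwise an empty list."""
        if self.pending and now - self.window_start >= self.delay_seconds:
            batch = sorted(self.pending)
            self.pending.clear()
            self.window_start = None
            return batch
        return []
```

Recording 1,000 events inside one window and then calling `flush_if_due` after the delay yields a single batch of 1,000 paths, which corresponds to one rsync run rather than 1,000.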
5. Identifying Sync Gaps
If a worker is "out of sync":
- Check the lsyncd logs: /var/log/lsyncd/lsyncd.log.
- Check if the worker's hard drive is full.
- Check if the SSH keys have expired or been changed.
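The first two checks are easy to script. Below is a rough Python sketch: `disk_usage_percent` is meant to run on the worker, and `recent_error_lines` on the Master. Both names are hypothetical, and the log filter simply looks for the word "error" in each line, which is an assumption about what is worth surfacing rather than a documented lsyncd log format.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some virtual filesystems report a zero size
    return usage.used / usage.total * 100


def recent_error_lines(log_path: str, limit: int = 20) -> list:
    """Return up to `limit` of the most recent log lines mentioning 'error'."""
    with open(log_path, "r", errors="replace") as log:
        errors = [line.rstrip("\n") for line in log if "error" in line.lower()]
    return errors[-limit:]
```

A worker that reports a value close to 100 from `disk_usage_percent("/var/www/html")` is a likely cause of failed syncs, since rsync cannot write new files to a full disk.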
6. Example: A Sync Latency Tester (Python)
How fast is your mirror? Here is a Python script that creates a "Test File" on the Master, waits for it to appear on the Worker, and measures the "Sync Latency."
import shlex
import subprocess
import time


def test_sync_speed(target_server, remote_path, timeout=10):
    """
    Measures the delay between a local save and a remote update.
    """
    filename = f"sync_test_{int(time.time())}.txt"
    local_file = f"/var/www/html/{filename}"
    remote_file = f"{remote_path}/{filename}"
    # Quote the remote path so the remote shell treats it as one argument
    remote_check = f"test -f {shlex.quote(remote_file)}"

    print(f"Creating local file {filename}...")
    with open(local_file, "w") as f:
        f.write("HEALTH_CHECK")

    # monotonic() is not affected by system clock adjustments
    start_time = time.monotonic()
    print("Waiting for sync...")
    while True:
        # Check if remote file exists via SSH
        res = subprocess.run(["ssh", target_server, remote_check],
                             capture_output=True)
        if res.returncode == 0:
            latency = time.monotonic() - start_time
            print(f"[SUCCESS] File appeared on worker in {latency:.2f} seconds.")
            break
        if time.monotonic() - start_time > timeout:
            print(f"[FAIL] Sync timed out after {timeout} seconds. Check lsyncd status!")
            break
        time.sleep(0.5)


if __name__ == "__main__":
    # Example usage:
    # test_sync_speed("web-worker-01", "/var/www/html")
    pass
7. Professional Tip: Use 'readonly' on Workers
To prevent your cluster from getting "Corrupted" by accident, your Web Workers should treat the /var/www/html folder as Read-Only for the web application (but writable for the dedicated lsyncd user). If a worker's web application is compromised, the attacker then cannot modify the deployed code through the web process. Note that the configuration in this lesson syncs one way, from Master to Workers, so changes made on a worker are never pushed back to the Master.
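One rough way to audit this on a worker is to look for files that are writable by anyone other than their owner. The sketch below assumes the web application runs as a different user than the one that owns the synced files, so the group and other write bits are the ones that matter; `find_loosely_writable` is a hypothetical helper, and this is a heuristic rather than a complete permission audit.

```python
import os
import stat


def find_loosely_writable(root: str) -> list:
    """Return paths under `root` that are writable by group or others."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits on symlinks are not meaningful
            if mode & (stat.S_IWGRP | stat.S_IWOTH):
                flagged.append(path)
    return sorted(flagged)
```

An empty result from `find_loosely_writable("/var/www/html")` suggests that only the owning user can modify the deployed code.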
8. Summary
Synchronization is the glue that holds a cluster together.
- rsync is the efficient engine of movement.
- lsyncd is the event-driven brain that automates the engine.
- inotify allows Linux to "Watch" for file changes without polling, so the CPU cost is close to zero.
- SSH Keys are the mandatory foundation for passwordless automation.
- Sync Latency is the metric you must monitor to ensure your cluster stays in harmony.
In the next lesson, we move from files to data: Database Clusters—Intro to MySQL and Postgres Replication.
Quiz Questions
- Why is rsync faster than a standard scp (Secure Copy) for large folders?
- What does the lsyncd daemon do when the network connection to a worker is temporarily lost?
- What is the difference between rsync /source and rsync /source/?
Continue to Lesson 5: Database Clusters—MySQL and Postgres Replication.