
The Resource Police: Mastering Cgroups
Keep your processes in check. Master Linux 'Control Groups' (Cgroups). Learn how the kernel limits CPU usage, RAM allocation, and Disk I/O speed. Understand why a single runaway container can't crash your entire server.
1. Cgroups: The Resource Budget
In the previous lesson, we learned about Namespaces (Isolation). But isolation isn't enough. If you have a container that is isolated but starts a loop that uses 100% of the CPU, the other containers on the same host will starve.
To fix this, Linux uses Cgroups (Control Groups).
Cgroups are the "Police Force" of the kernel. They allow you to define a Budget for a group of processes: "This group can use 2GB of RAM and only 50% of 1 CPU core." If the group tries to use more CPU than its budget, the kernel throttles it; if it tries to use more memory than its budget, the kernel kills processes in the group.
In this lesson, we will understand how Docker and Systemd use Cgroups to maintain stability.
2. The Two Versions: v1 vs v2
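- Cgroups v1: The classic version. Complex, because each resource controller (cpu, memory, blkio, and so on) has its own separate directory tree.
- Cgroups v2: The modern standard. It has been in the kernel since Linux 4.5 (2016) and became the distro default from around 2019/2020. It uses one unified tree for all resources, which makes it simpler and more efficient to manage. Most modern distros (Ubuntu 22.04+, RHEL 9+) use v2 by default.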
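To find out which version a host is running, you can look for the file cgroup.controllers at the top of the cgroup mount, which exists only on the unified (v2) hierarchy. Here is a minimal sketch, assuming the conventional mount point /sys/fs/cgroup. The function name detect_cgroup_version is made up for illustration, and treating any other non-empty mount as v1 is a simplification (it does not distinguish hybrid setups).

```python
import os

def detect_cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 for a cgroup v2 mount, 1 for anything else non-empty, None otherwise."""
    # cgroup.controllers only exists at the root of the unified (v2) hierarchy.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return 2
    # Assumption: a non-empty directory without that file is a v1 (or hybrid) layout.
    if os.path.isdir(root) and os.listdir(root):
        return 1
    return None

if __name__ == "__main__":
    print("cgroup version:", detect_cgroup_version())
```

The rest of this lesson uses v2 file names such as cpu.max and memory.max, so it is worth running this check first.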
3. The Hierarchy: /sys/fs/cgroup
Every process on your system belongs to a Cgroup. You can see the groups as a "Directory Tree" in the virtual filesystem /sys/fs/cgroup.
# See the groups managed by systemd
ls /sys/fs/cgroup/system.slice
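You can also go the other way and ask which group a given process belongs to. The kernel exposes this in /proc/[PID]/cgroup, and on a v2 host the entry looks like "0::/some/path". Here is a small sketch that parses that format. The helper name cgroup_path_from_proc is an assumption for illustration, and it only handles the v2 entry.

```python
def cgroup_path_from_proc(text):
    """Return the cgroup v2 path from the contents of /proc/<pid>/cgroup, or None."""
    for line in text.splitlines():
        # Each line is "hierarchy-ID:controller-list:path".
        # The v2 entry uses hierarchy ID 0 and an empty controller list.
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None

if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as f:
            print("This process lives in:", cgroup_path_from_proc(f.read()))
    except OSError:
        print("Could not read /proc/self/cgroup (not a Linux host?)")
```

Appending the returned path to /sys/fs/cgroup gives you the directory that holds that process's limits and stat files.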
4. Practical: Limiting a Process manually
Suppose you have a script heavy_task.sh and you want to ensure it never uses more than 10% of the CPU.
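- Create a group:
sudo mkdir /sys/fs/cgroup/my_limit
- Set the limit (10% of 1 core, meaning 10,000 microseconds of CPU time per 100,000-microsecond period):
echo "10000 100000" | sudo tee /sys/fs/cgroup/my_limit/cpu.max
(If cpu.max does not exist in the new directory, the cpu controller is not enabled for child groups. Enable it with: echo "+cpu" | sudo tee /sys/fs/cgroup/cgroup.subtree_control)
- Add your process to the group:
echo [PID] | sudo tee /sys/fs/cgroup/my_limit/cgroup.procs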
The kernel will now "Pause" your process whenever the group has used up its 10 ms of CPU time within a 100 ms period, and resume it when the next period begins. This keeps it within the 10% limit.
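The two numbers in cpu.max are just "quota" and "period" in microseconds, so the arithmetic is easy to script. This sketch turns a number of cores into the string you would write to the file. The function name cpu_max_value is hypothetical, and the 100,000-microsecond period is the same one used in the example above.

```python
def cpu_max_value(cores, period_usec=100_000):
    """Build the 'quota period' string for cpu.max from a number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    # quota = how many microseconds of CPU time the group may use per period
    quota_usec = int(round(cores * period_usec))
    return f"{quota_usec} {period_usec}"

print(cpu_max_value(0.10))  # 10% of one core  -> 10000 100000
print(cpu_max_value(1.5))   # one and a half cores -> 150000 100000
```

Note that the quota can be larger than the period: on a multi-core machine, "150000 100000" lets the group use one and a half cores' worth of time.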
5. Docker Implementation: Memory Limits
When you run docker run --memory="512m" nginx, Docker isn't doing anything magical. It is simply creating a Cgroup folder and writing 536870912 to a file called memory.max.
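If the Nginx container tries to grow past 512MB and the kernel cannot reclaim enough memory from the group, the kernel's memory controller (as we learned in Module 17) will trigger an OOM kill inside that container's Cgroup only, leaving the rest of the host untouched.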
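Where does 536870912 come from? The "m" suffix is binary, so 512m means 512 x 1024 x 1024 bytes. Here is a simplified sketch of that conversion. The function parse_memory_limit is an illustration only, not Docker's actual parser, and it handles just the single-letter suffixes b, k, m, and g.

```python
# Binary (1024-based) multipliers for the single-letter suffixes.
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

def parse_memory_limit(text):
    """Convert a size such as '512m' to a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # assumption: a bare number is already in bytes

print(parse_memory_limit("512m"))  # 536870912
```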
6. Identification: Who is being throttled?
How do you know if your app is slow because it's poorly coded, or because it's hitting a Cgroup limit? You check the "Stat" files.
# Check if a group has been throttled (held back) by the CPU controller
cat /sys/fs/cgroup/my_limit/cpu.stat
# Look for 'nr_throttled'
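A raw 'nr_throttled' count is hard to judge on its own. It is more useful next to 'nr_periods', because the ratio tells you in what share of scheduling periods the group hit its ceiling. Here is a minimal sketch that parses the contents of cpu.stat. The helper names and the sample numbers are made up for illustration.

```python
def parse_cpu_stat(text):
    """Turn the 'key value' lines of cpu.stat into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats

def throttle_ratio(stats):
    """Share of enforcement periods in which the group was throttled (0.0 to 1.0)."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0

# Made-up sample in the shape of a v2 cpu.stat file
sample = """usage_usec 5000000
user_usec 4000000
system_usec 1000000
nr_periods 200
nr_throttled 50
throttled_usec 1200000"""

stats = parse_cpu_stat(sample)
print(f"throttled in {throttle_ratio(stats):.0%} of periods")  # throttled in 25% of periods
```

A ratio near zero points at your code; a high ratio points at the limit.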
7. Example: A Cgroup Resource Auditor (Python)
If you are a sysadmin, you want to know which of your containers are "Hitting the Ceiling" of their allowed resources. Here is a Python script that tracks Cgroup throttling events.
import os

def check_throttling():
    """
    Scans modern cgroup v2 paths for CPU throttling stats.
    """
    base_path = "/sys/fs/cgroup"
    print("--- Cgroup Throttling Audit ---")
    # Walk the whole cgroup tree (services usually live under system.slice)
    for root, dirs, files in os.walk(base_path):
        if "cpu.stat" not in files:
            continue
        file_path = os.path.join(root, "cpu.stat")
        try:
            with open(file_path, 'r') as f:
                content = f.read()
        except OSError:
            continue  # the group may disappear while we are scanning
        # Report every group where 'throttled_usec' > 0
        for line in content.splitlines():
            if line.startswith("throttled_usec"):
                val = int(line.split()[1])
                if val > 0:
                    folder = os.path.basename(root)
                    # throttled_usec is in microseconds; divide by 1000 for ms
                    print(f"[!] Group '{folder}' was throttled for {val/1000:.2f} ms")

if __name__ == "__main__":
    check_throttling()
8. Professional Tip: Check 'Disk I/O' limits
Most people use Cgroups for CPU/RAM, but they are incredibly powerful for Disk I/O. If a single server is running 3 web apps and 1 database, you can use io.max to ensure the web apps can't "Lock up" the database's access to the hard drive.
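An io.max entry names a block device by its major:minor number, followed by one or more of rbps, wbps (bytes per second), riops, and wiops (operations per second). Here is a sketch that builds such a line. The function name io_max_line is hypothetical, and device 8:0 (conventionally the first SCSI/SATA disk) is an assumption you should check with lsblk on your own machine.

```python
def io_max_line(major, minor, rbps=None, wbps=None, riops=None, wiops=None):
    """Build one io.max entry such as '8:0 rbps=10485760 wbps=5242880'."""
    limits = {"rbps": rbps, "wbps": wbps, "riops": riops, "wiops": wiops}
    # Only emit the limits that were actually requested.
    fields = [f"{key}={value}" for key, value in limits.items() if value is not None]
    if not fields:
        raise ValueError("set at least one limit")
    return f"{major}:{minor} " + " ".join(fields)

# Cap reads at 10 MiB/s and writes at 5 MiB/s on device 8:0 (assumed device number)
print(io_max_line(8, 0, rbps=10 * 1024**2, wbps=5 * 1024**2))
```

You would write the resulting line to the io.max file of the group that holds the web apps, the same way cpu.max was written in section 4.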
9. Summary
Cgroups are the "Physical Reality" of Linux multitasking.
- Namespaces say "You can't see them"; Cgroups say "You can't have them."
- v2 is the modern, unified architecture.
- cpu.max and memory.max are the two most important limits.
- Throttling is how the kernel enforces limits without killing the process immediately.
- Docker is just a user-friendly interface for writing to /sys/fs/cgroup.
In the next lesson, we will look at how containers handle their files: The Onion FS (OverlayFS).
Quiz Questions
- What is the difference between "Throttling" a CPU and "Killing" a process for RAM usage?
- Where in the filesystem can you find the current resource settings for a running service?
- Why is Cgroup v2 considered better than v1?