
Module 8 Exercises: Advanced Autoscaling
Build an elastic cluster. Configure HPA for high demand, use VPA to right-size your containers, and trigger cluster-level growth.
In Module 8, we learned how to make our infrastructure responsive to the real world. You learned how to scale Pods (HPA/VPA) and Nodes (Cluster Autoscaler). These exercises will walk you through setting up a truly elastic environment.
Exercise 1: Triggering the HPA Surge
- Deployment: Create a Deployment named `load-test` with 1 replica and a CPU request of `100m`.
- HPA: Create an HPA for that deployment with `minReplicas: 1`, `maxReplicas: 10`, and a target CPU of 50%.
- Stress Test: Run a "Load Generator" pod (e.g., `busybox` running `while true; do wget -q -O- http://load-test; done`).
- Observation:
  - Watch the HPA status with `kubectl get hpa -w`.
  - How long did it take for the first new pod to be created?
  - Once the new pods were running, did the "CPU %" in `kubectl get hpa` drop or rise? Why?
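The first two steps above can be sketched as the manifests below. This is a minimal illustration, not a prescribed solution: the container image, labels, and the Service that makes `http://load-test` resolvable for the load generator are all assumptions.

```yaml
# Illustrative setup for Exercise 1 (image, labels, and Service are assumptions).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: load-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: load-test
  template:
    metadata:
      labels:
        app: load-test
    spec:
      containers:
      - name: web
        image: nginx:1.25        # any small HTTP server works here
        resources:
          requests:
            cpu: 100m            # the HPA's 50% target is relative to this request
---
apiVersion: v1
kind: Service
metadata:
  name: load-test                # gives the load generator a stable http://load-test URL
spec:
  selector:
    app: load-test
  ports:
  - port: 80
    targetPort: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: load-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: load-test
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```

Note that the HPA needs the metrics-server installed in the cluster, or `kubectl get hpa` will show `<unknown>` for the current utilization.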
Exercise 2: Analyzing VPA Advice
- Deployment: Create a Pod that intentionally has "stingy" resources (e.g., requests of 50m CPU and 50Mi RAM) but runs a heavy workload.
- VPA: Create a VerticalPodAutoscaler with `updateMode: "Off"`.
- Observation: Wait 10 minutes, then run `kubectl describe vpa`.
- Action: What are the "Target" recommendations? If you were to switch the VPA to `Auto` mode, would the pod restart immediately?
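A recommendation-only VPA for this exercise might look like the following sketch (the VPA CRDs must already be installed in the cluster; the object names are illustrative):

```yaml
# Illustrative VPA in recommendation-only mode (names are assumptions).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stingy-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stingy-app       # the workload with the deliberately low requests
  updatePolicy:
    updateMode: "Off"      # record Target recommendations only; never evict pods
```

With `updateMode: "Off"`, the recommender still populates the `Status.Recommendation` section you will read with `kubectl describe vpa`, but the updater never touches running pods.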
Exercise 3: Cluster "Headroom" Analysis
- Scenario: You have 3 nodes, each with 4 cores. Your pods are currently requesting a total of 10 cores.
- Question: If you decide to scale a Deployment to 50 replicas, each requesting `200m` CPU, what will happen?
  - New requested CPU = 50 * 200m = 10 cores.
  - Existing pods = 10 cores.
  - Total = 20 cores, against only 12 cores of node capacity.
- Prediction: Which component will catch this? What will be the final state of the 50 new pods?
- Action: Describe the "Scale Up" event you would see in the Cluster Autoscaler logs.
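One way to watch the prediction play out on a live cluster (the pod name is a placeholder; the status ConfigMap assumes the Cluster Autoscaler's default configuration):

```shell
# List pods the scheduler could not place
kubectl get pods --field-selector=status.phase=Pending

# The scheduler's "Insufficient cpu" message and, once the Cluster
# Autoscaler reacts, a TriggeredScaleUp event appear in the pod's Events
kubectl describe pod <pending-pod-name>

# The Cluster Autoscaler also records its activity on this ConfigMap
kubectl describe configmap cluster-autoscaler-status -n kube-system
```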
Exercise 4: Scaling for "Zero" (Idling)
- Goal: Configure an HPA that can scale your deployment down to 0 replicas when there is no traffic.
- Investigation: By default, does `minReplicas: 0` work in the standard Kubernetes HPA? (Hint: check Lesson 1.)
- Solution: If the standard HPA only supports `minReplicas: 1`, what external project would you use to allow "Scale to Zero"? (Hint: it starts with 'K'.)
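As a preview of the solution, a scale-to-zero setup with that external project typically looks like the sketch below. Everything here is an assumption for illustration: the trigger type, the Prometheus address, and the query all depend on your environment.

```yaml
# Illustrative scale-to-zero configuration (trigger, address, and query are assumptions).
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: load-test-scaler
spec:
  scaleTargetRef:
    name: load-test              # the Deployment to manage
  minReplicaCount: 0             # unlike the standard HPA, this can idle the workload
  maxReplicaCount: 10
  triggers:
  - type: prometheus             # example trigger; requires a Prometheus install
    metadata:
      serverAddress: http://prometheus.monitoring:9090
      query: sum(rate(http_requests_total{service="load-test"}[1m]))
      threshold: "5"
```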
Solutions (Self-Check)
Exercise 1 Answer:
- It usually takes 30-60 seconds to trigger: the metrics-server must first scrape the new CPU usage, and the HPA control loop only re-evaluates every 15 seconds by default.
- The "CPU %" will drop. As more pods are added, the average utilization of the group decreases, which is the goal of the HPA.
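Both observations follow from the HPA's core formula, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. A quick sketch of the arithmetic (the 200% starting utilization is an illustrative assumption):

```python
import math

def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Core HPA scaling formula: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_util / target_util)

# 1 replica running at 200% of its CPU request, target 50%:
print(desired_replicas(1, 200, 50))  # -> 4

# The same total load spread across 4 pods averages ~50%,
# so the HPA holds steady instead of scaling further:
print(desired_replicas(4, 50, 50))   # -> 4
```

This is why the "CPU %" column drops as replicas are added: the numerator (total load) stays roughly constant while the number of pods sharing it grows.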
Exercise 2 Hint:
If you switch to `Auto`, the Pod will be restarted (evicted) immediately only if its current requests differ significantly from the recommendation; the VPA updater's change threshold defaults to roughly 10%.
Exercise 3 Logic:
The Cluster Autoscaler will catch the "Pending" status of the new pods. It will see that the cluster needs 8 more cores and will trigger the cloud provider to spin up 2-3 additional worker nodes.
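The arithmetic behind "8 more cores" and "2-3 nodes", assuming for simplicity that the full 4 cores per node are allocatable to pods:

```python
import math

nodes, cores_per_node = 3, 4
capacity = nodes * cores_per_node        # 12 allocatable cores (simplifying assumption)
existing_requests = 10                   # cores already requested by running pods
new_requests = 50 * 0.2                  # 50 new pods at 200m each = 10 cores

free = capacity - existing_requests      # 2 cores -> roughly 10 of the new pods fit
pending = new_requests - free            # 8 cores' worth of pods stay Pending
extra_nodes = math.ceil(pending / cores_per_node)
print(extra_nodes)                       # -> 2
```

In practice system pods (kube-proxy, CNI agents, etc.) consume part of each node, which is why the realistic answer is 2-3 nodes rather than exactly 2.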
Exercise 4 Solution:
- The standard HPA only scales down to 1 replica; `minReplicas: 0` is rejected unless the alpha `HPAScaleToZero` feature gate is enabled.
- To scale to 0, you need KEDA (Kubernetes Event-Driven Autoscaling). KEDA is the industry standard for serverless-style scaling on K8s.
Summary of Module 8
You have built an elastic system.
- You can handle traffic surges automatically with HPA.
- You can save cloud costs and prevent OOMKills with VPA.
- You can grow the physical cluster with the Cluster Autoscaler.
- You understand the math of the scaling algorithm.
In Module 9: Logging, Monitoring, and Observability, we will learn how to build the dashboards that visualize this dynamic behavior.