# Horizontal Pod Autoscaler (HPA) Demo
Now that you have an understanding of how HPA works, let's see it in action.
## Docker Images
Here is the Docker image used in this tutorial: `reyanshkharga/nodeapp:v1`
Note
`reyanshkharga/nodeapp:v1` runs on port 5000 and has the following routes:

- `GET /`: Returns host info and app version
- `GET /health`: Returns health status of the app
- `GET /random`: Returns a randomly generated number between 1 and 10
## Objective
We'll follow these steps to test the Horizontal Pod Autoscaler (HPA):
- We'll create a `Deployment` and a `Service` object.
- We'll create a `HorizontalPodAutoscaler` object for the deployment.
- We'll generate load on pods managed by the deployment.
- We'll observe HPA taking autoscaling actions to meet the increased demand.
Let's see this in action!
## Step 1: Create a Deployment
First, let's create a deployment as follows:
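The original manifest is not included in this extract; here is a minimal sketch, assuming the file is named `deployment.yml`. The names (`my-deployment`, the `app: nodeapp` label) are illustrative, but the image, container port, and resource limits match those described in this tutorial:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodeapp
  template:
    metadata:
      labels:
        app: nodeapp
    spec:
      containers:
        - name: nodeapp
          image: reyanshkharga/nodeapp:v1
          ports:
            - containerPort: 5000
          resources:
            # Requests are required for the HPA to compute utilization percentages
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 100m
              memory: 128Mi
```

Note that the `resources.requests` block is essential: the HPA computes utilization as a percentage of the requested CPU and memory.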
Apply the manifest to create the deployment:
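Assuming the manifest above is saved as `deployment.yml`:

```shell
kubectl apply -f deployment.yml
```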
Verify deployment and pods:
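For example:

```shell
# Check the deployment status
kubectl get deployments

# Check the pods managed by the deployment
kubectl get pods
```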
Note that each pod can consume a maximum of 100m CPU and 128Mi memory.
## Step 2: Create a Service
Next, let's create a LoadBalancer service as follows:
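The service manifest is not shown in this extract; here is a minimal sketch, assuming the file is named `service.yml` and that the deployment's pods carry the illustrative `app: nodeapp` label:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service # illustrative name
spec:
  type: LoadBalancer
  selector:
    app: nodeapp
  ports:
    - port: 80
      targetPort: 5000 # the port the app listens on
```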
Apply the manifest to create the service:
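Assuming the manifest above is saved as `service.yml`:

```shell
kubectl apply -f service.yml
```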
Verify service:
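For example:

```shell
# EXTERNAL-IP may show <pending> until the cloud load balancer is provisioned
kubectl get svc
```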
## Step 3: Create HPA for the Deployment
Now, let's create an HPA for the deployment as follows:
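The HPA manifest is not shown in this extract; here is a plausible sketch, assuming the file is named `hpa.yml`, with illustrative thresholds (50% average CPU and memory utilization) and replica bounds (1 to 5). The name `my-hpa` matches the `kubectl describe` command used later, and `scaleTargetRef` assumes the deployment is named `my-deployment`:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment # must match the deployment's name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    # Scale when average CPU utilization exceeds 50% of requests
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    # Scale when average memory utilization exceeds 50% of requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 50
```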
Apply the manifest to create the HPA:
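Assuming the manifest above is saved as `hpa.yml`:

```shell
kubectl apply -f hpa.yml
```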
Verify HPA:
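For example:

```shell
# TARGETS shows current vs. target utilization for each metric
kubectl get hpa
```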
## Step 4: Generate Load
Let's generate load on the pods managed by the deployment. On your local machine run the following command to generate the load:
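The load-generation command is not shown in this extract; one possible sketch uses the `hey` load generator (an assumption; any HTTP load tool works), with a placeholder for your LoadBalancer DNS name:

```shell
# 100 concurrent workers, each rate-limited to 10 requests/sec,
# for a total of ~1000 requests/sec, sustained for 5 minutes
hey -z 5m -c 100 -q 10 http://<load-balancer-dns>/random
```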
The above command concurrently sends 1000 requests per second to the LoadBalancer service using 100 parallel processes.
## Step 5: Monitor Pods and HPA Events
```shell
# List pods in watch mode
kubectl get pods -w

# List hpa in watch mode
kubectl get hpa -w

# View hpa events
kubectl describe hpa my-hpa
```
You'll notice that as soon as any of the defined thresholds is crossed, the HPA scales up the number of replicas to bring resource utilization back within the defined thresholds.

Here's a sample event from the HPA:
```
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range

Events:
  Type    Reason             Age  From                       Message
  ----    ------             ---  ----                       -------
  Normal  SuccessfulRescale  5s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
```
## Clean Up
Assuming your folder structure looks like the one below:
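The folder structure is not shown in this extract; a typical layout, using the hypothetical filenames from the earlier steps, might be:

```
manifests
├── deployment.yml
├── service.yml
└── hpa.yml
```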
Let's delete all the resources we created:
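Assuming all the manifests live in a single folder, one command removes everything:

```shell
# Run from the folder containing the manifests
kubectl delete -f .
```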