Kubernetes Pod Autoscaling with HPA Working Example
In this article, I will show how to scale pods in Kubernetes using the Horizontal Pod Autoscaler (HPA). I assume that you already have a deployment running in Kubernetes and that you want to scale it automatically based on CPU or memory usage.
1 Find the deployment spec
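For autoscaling, the most important part is the resources section, where you specify the CPU and memory requests, that is, the minimum amount of each resource that your pod needs to run. The HPA measures utilization as a percentage of these requests, so a CPU-based HPA needs a CPU request on every container in the pod.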
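To check what a running deployment currently requests, a small helper along these lines may be enough. This is a sketch, not part of the original setup: the function name show_resources is hypothetical, and it assumes kubectl is already pointed at the right cluster and namespace.

```shell
# Hypothetical helper: print each container's name and its resources section
# (requests and limits) for the given deployment.
# Argument 1: deployment name.
show_resources() {
  kubectl get deployment "$1" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}
```

Calling show_resources mobile-api prints one line per container. An empty resources value means no requests are set, and CPU-based autoscaling will not work until they are added.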
2 Create the HPA
Here we create an HPA for the deployment mobile-api and tell it to keep between 1 and 3 pods based on CPU usage. The HPA checks the average CPU utilization of the pods, and if it is above the 5% target, it scales the deployment up.
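One way to create an HPA like the one described is kubectl autoscale. The wrapper below is only a sketch: the function name create_cpu_hpa is hypothetical, and the values passed to it (mobile-api, 1, 3, 5) are taken from the description above.

```shell
# Hypothetical helper: create a CPU-based HPA for a deployment.
# Arguments: deployment name, min replicas, max replicas, target CPU utilization (%).
create_cpu_hpa() {
  kubectl autoscale deployment "$1" --min="$2" --max="$3" --cpu-percent="$4"
}
```

Calling create_cpu_hpa mobile-api 1 3 5 runs kubectl autoscale deployment mobile-api --min=1 --max=3 --cpu-percent=5. When no explicit name is given, the HPA gets the same name as the deployment.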
3 Check the HPA
Based on the current CPU usage, the HPA scaled the deployment to 2 pods.
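The state of the HPA can be read with kubectl get hpa. A minimal sketch, assuming the HPA carries the same name as the deployment (the function name check_hpa is hypothetical):

```shell
# Hypothetical helper: show the current state of an HPA.
# Argument 1: HPA name.
check_hpa() {
  kubectl get hpa "$1"
}
```

Calling check_hpa mobile-api prints a table whose TARGETS column shows current versus target utilization, next to MINPODS, MAXPODS and REPLICAS.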
We can also run a stress test to see how the HPA reacts.
4 Stress test
This creates a pod from the busybox image, which we can use to stress test our deployment.
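A command along these lines would start such a pod and open a shell in it. It is a sketch: the pod name load-generator and the function name start_load_generator are assumptions, not taken from the original setup.

```shell
# Hypothetical helper: start a temporary busybox pod and attach an interactive shell.
# --rm removes the pod again when the shell exits.
start_load_generator() {
  kubectl run load-generator -i --tty --rm --image=busybox --restart=Never -- /bin/sh
}
```

After calling start_load_generator, you get a shell prompt inside the cluster, from which the deployment can be reached through its Service.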
This sends a request to our deployment every second, so we can watch how the HPA scales it.
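The request loop can be sketched as a small function to paste into the busybox shell. The function name generate_load and the optional request count are assumptions added for illustration, and the URL http://mobile-api used in the usage note assumes a Service named mobile-api in the same namespace.

```shell
# Hypothetical helper: request a URL once per second.
# Argument 1: URL to request.
# Argument 2 (optional): number of requests; 0 or omitted means loop until interrupted.
generate_load() {
  url="$1"
  count="${2:-0}"
  i=0
  while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
    # Discard the response body; only the load on the pods matters here.
    wget -q -O- "$url" >/dev/null
    i=$((i + 1))
    sleep 1
  done
}
```

Calling generate_load http://mobile-api keeps sending requests until you press Ctrl+C. One request per second may be too little to move CPU usage on a larger service, in which case a shorter sleep is an option.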
The HPA decided to scale the deployment to 3 pods.
We can see the HPA events in the events section of the dashboard.
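The same events are also available from the command line. The sketch below assumes the HPA and the deployment share one name, which is the default when the HPA is created with kubectl autoscale; the function name show_scaling_events is hypothetical.

```shell
# Hypothetical helper: describe the HPA and its deployment.
# The Events section at the end of each description lists recent scaling activity.
# Argument 1: name shared by the HPA and the deployment.
show_scaling_events() {
  kubectl describe hpa "$1"
  kubectl describe deployment "$1"
}
```

Calling show_scaling_events mobile-api shows the rescale decisions on the HPA and the resulting replica set changes on the deployment.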
Whenever the HPA saw CPU usage above the 5% target, it scaled the deployment up, in this case to 3 pods. The resulting event reads: Scaled up replica set mobile-api-78775d59fc to 3 from 2
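The replica count the HPA aims for follows a documented rule: desired = ceil(current replicas * current utilization / target utilization), limited to the configured minimum and maximum. The function below is a simplified sketch of that rule in integer shell arithmetic. It ignores the tolerance band and stabilization windows that the real controller applies, and the utilization figures in the usage note are made-up examples, not measurements from this cluster.

```shell
# Simplified sketch of the HPA scaling rule (no tolerance, no stabilization).
# Arguments: current replicas, current utilization (%), target utilization (%),
#            min replicas, max replicas.
desired_replicas() {
  current="$1"; util="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  # Clamp the result to the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}
```

With 2 pods at a hypothetical 12% average utilization and a 5% target, desired_replicas 2 12 5 1 3 prints 3: the raw result is 5, but the maximum of 3 caps it. With 3 pods at a hypothetical 2%, desired_replicas 3 2 5 1 3 prints 2.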
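If we stop the stress test, the HPA will scale the deployment back down to fewer pods. This does not happen immediately: by default the HPA waits through a stabilization window of 5 minutes before scaling down.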
If you need to delete the HPA, you can use this command.
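A minimal sketch of the deletion, assuming the HPA is named after the deployment (the function name delete_hpa is hypothetical):

```shell
# Hypothetical helper: delete an HPA by name.
# Argument 1: HPA name.
delete_hpa() {
  kubectl delete hpa "$1"
}
```

Calling delete_hpa mobile-api runs kubectl delete hpa mobile-api. Deleting the HPA only stops autoscaling; the deployment stays at whatever replica count it had at that moment.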