Strategies for Using VPA and HPA Together

Christopher Adamson

Software Engineer, SRE at The Boeing Company

Published Mar 10, 2024

The Vertical Pod Autoscaler (VPA) and Horizontal Pod Autoscaler (HPA) provide powerful but distinct capabilities for optimizing resources and availability in Kubernetes clusters. VPA automatically adjusts CPU and memory requests and limits for pods based on observed usage. This allows more efficient resource utilization. HPA scales the number of pod replicas horizontally based on metrics like CPU to maintain availability and handle load spikes.

These two autoscaling tools can be used together, but doing so requires careful configuration and coordination. The best practices are not always obvious. Missteps such as letting both autoscalers react to the same CPU or memory signal, or pairing them with incompatible metrics, can lead to poor performance or conflicts.

This tutorial covers strategies for running VPA and HPA side by side. We will explore configurations and techniques that unlock the benefits of both, such as how to set targets and modes so that the two interact properly. We also discuss when it may be better to use each tool separately, such as VPA for optimizing DaemonSets or HPA for scaling front-ends.

Understanding the nuances of operating VPA and HPA together or independently will help you make the most of Kubernetes' autoscaling powers. Optimizing resource usage and availability provides cost savings, performance, and resilience. This guide aims to provide the knowledge needed to dynamically scale both horizontally and vertically using the right tools for the job.

How VPA and HPA Work

VPA automatically adjusts the CPU and memory requests and limits for pods based on usage. It can save you from having to manually tune resource requests.

HPA automatically scales the number of pod replicas based on observed CPU utilization or other metrics. It helps maintain availability and handle sudden traffic spikes.
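As a rough illustration of the HPA side, the Kubernetes documentation describes the core scaling rule as the current replica count multiplied by the ratio of the current metric value to the target value, rounded up, with no change when that ratio is within a tolerance (0.1 by default). The sketch below models only that rule; the function name and the sample numbers are hypothetical, and the real controller also handles unready pods, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Simplified model of the HPA rule: ceil(replicas * current / target)."""
    ratio = current_value / target_value
    # Within the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical readings: 4 replicas averaging 90% CPU against a 50% target.
print(desired_replicas(4, 90, 50))
# A reading close to the target stays inside the tolerance band.
print(desired_replicas(4, 52, 50))
```

The ratio form is why the choice of target matters: the further the observed value drifts from the target, the larger the jump in replicas.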

Coordinating VPA and HPA

To allow VPA and HPA to work together effectively, they need to be properly configured to avoid conflicts.

One recommended approach is to run VPA in recommendation-only mode (updateMode set to "Off") rather than in an updating mode such as "Auto" or "Recreate". In recommendation-only mode, VPA computes suggested resource requests and publishes them in the status of the VPA object, but does not apply them to pods. HPA then remains the only autoscaler acting on the workload, and you can review VPA's recommendations and apply them to the pod spec yourself.
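A minimal sketch of such a VPA object follows, built as a Python dictionary and printed as JSON. It assumes the VPA custom resources are installed in the cluster; the field names come from the autoscaling.k8s.io/v1 API, where recommendation-only behaviour is selected with updateMode "Off". The helper name and the names web-vpa and web are hypothetical placeholders.

```python
import json


def vpa_recommendation_only(name: str, deployment: str) -> dict:
    """Build a VPA manifest that only publishes recommendations."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose pods VPA should observe.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # "Off" means: compute recommendations, never modify pods.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


# Hypothetical names for illustration only.
manifest = vpa_recommendation_only("web-vpa", "web")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f, after which kubectl describe vpa web-vpa should list the recommended requests once enough usage data has been collected.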

For example, VPA may monitor usage and recommend a pod CPU request of 500m. VPA does not hand this value to HPA; HPA calculates utilization against the requests actually set on the running pods. Once the 500m request has been applied to the workload's manifest, an HPA with a target CPU utilization of 50% adds or removes replicas so that each pod averages roughly 250m of CPU.
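The arithmetic behind this example can be worked through in a few lines. The sketch assumes the 500m request is already set on the pods, since HPA measures utilization against the request in the pod spec, and it uses made-up per-pod usage figures. It ignores the tolerance band and other controller details.

```python
import math


def average_utilization(usage_millicores: list[int], request_millicores: int) -> float:
    """CPU utilization as HPA sees it: total usage over total requests, in percent."""
    total_request = request_millicores * len(usage_millicores)
    return 100.0 * sum(usage_millicores) / total_request


# Hypothetical measurements for three pods, each requesting 500m.
usage = [400, 450, 350]
request = 500
target_percent = 50

utilization = average_utilization(usage, request)
replicas = math.ceil(len(usage) * utilization / target_percent)

print(utilization)  # 80.0
print(replicas)     # 5
```

With five replicas the same 1200m of total load works out to 240m per pod, or 48% of the 500m request, which is just under the 50% target.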

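When configuring the metrics and targets for HPA, choose a metric that is consistent with what VPA is basing its recommendations on. For CPU optimization with VPA in recommendation-only mode, both can use CPU. Set an HPA target CPU utilization percentage that matches the expected usage pattern once the recommended requests are in place, such as 50-70%. If VPA is later switched to a mode that updates pods, avoid having both autoscalers act on CPU or memory at the same time; in that setup HPA is usually driven by a custom or external metric instead.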
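A sketch of an HPA object with a CPU utilization target is shown below, again as a Python dictionary printed as JSON. The structure follows the autoscaling/v2 API; the helper name, the object names, the replica bounds, and the 60% target are hypothetical values chosen from within the range mentioned above.

```python
import json


def cpu_hpa(name: str, deployment: str, min_replicas: int,
            max_replicas: int, target_percent: int) -> dict:
    """Build an HPA manifest that scales on average CPU utilization."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose replica count HPA manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    # Utilization is measured relative to the pods' CPU requests.
                    "target": {"type": "Utilization",
                               "averageUtilization": target_percent},
                },
            }],
        },
    }


# Hypothetical values for illustration only.
hpa = cpu_hpa("web-hpa", "web", min_replicas=2, max_replicas=10, target_percent=60)
print(json.dumps(hpa, indent=2))
```

Because the target is expressed relative to the request, changing the request after reviewing a VPA recommendation also changes what 60% means in absolute CPU terms.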

Additionally, VPA and HPA need to agree on which pods they manage. HPA identifies the workload to scale through its scaleTargetRef, and VPA identifies the workload to observe through its targetRef; both then find the pods through that workload's label selector. If the two reference different workloads, or if pod labels do not match the selector, one autoscaler may act on pods the other does not see. Ensure pods are labeled consistently so that both VPA and HPA discover the same set.
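One way to sanity-check this before applying anything is to compare the two manifests and the pod labels directly. The sketch below assumes the manifests have been loaded as dictionaries; scaleTargetRef and targetRef are the field names used by the HPA and VPA APIs, while the function names and sample data are hypothetical.

```python
def same_target(hpa: dict, vpa: dict) -> bool:
    """True when the HPA and the VPA reference the same workload."""
    h = hpa["spec"]["scaleTargetRef"]
    v = vpa["spec"]["targetRef"]
    return (h["kind"], h["name"]) == (v["kind"], v["name"])


def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """True when every selector label is present on the pod with the same value."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical manifests trimmed to the fields being compared.
hpa = {"spec": {"scaleTargetRef": {"kind": "Deployment", "name": "web"}}}
vpa = {"spec": {"targetRef": {"kind": "Deployment", "name": "web"}}}

print(same_target(hpa, vpa))
print(selector_matches({"app": "web"}, {"app": "web", "tier": "frontend"}))
print(selector_matches({"app": "web"}, {"app": "api"}))
```

A check like this only covers equality-based matchLabels; selectors that use matchExpressions would need additional handling.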

With VPA in recommendation-only mode, appropriately configured HPA targets, and consistent pod labeling, you can run VPA and HPA together to get the benefits of resource optimization and autoscaling in a well-coordinated manner.

Conclusion

When used properly, VPA and HPA provide complementary autoscaling capabilities that together can optimize resource usage, availability, and scalability in Kubernetes clusters. VPA excels at adjusting resource allocations vertically to meet pod needs cost-effectively. HPA scales pod counts horizontally to maintain desired metrics as load varies.

However, certain best practices should be followed when running VPA and HPA together. Run VPA in recommendation-only mode (updateMode "Off"), set HPA targets that are consistent with the requests VPA recommends, and use consistent labels so that both systems manage the same pods. Take care to avoid overlaps or conflicts in how they manipulate pods.

In some cases, it also makes sense to decouple VPA and HPA. For DaemonSets or batch workloads, where the replica count is not driven by load, VPA alone can right-size resources. HPA alone may be preferred for rapidly scaling front-ends that do not need per-pod resource tuning.

Ultimately, there is no single right way to leverage VPA and HPA. The needs of different workload types dictate where autoscaling through pod counts, resource optimization, or both is most appropriate. Use the strategies discussed here to evaluate how to best employ VPA and HPA together or independently.

With Kubernetes autoscaling capabilities, you can achieve efficient resource usage, optimal scaling, and workload availability. Carefully coordinated VPA and HPA provide the flexible tools to make this a reality.
