Scaling

Overview

Azure Kubernetes Service (AKS) provides several options for scaling a Kubernetes cluster. Node scaling adjusts the number of virtual machine nodes in the cluster to meet the changing demands of your applications, and pod scaling adjusts the number of pod replicas running on those nodes. The key scaling options are:

Manual Scaling:

With AKS, you can manually scale the number of nodes in your cluster based on your application needs.
You can use the Azure portal, Azure CLI, or Azure PowerShell to increase or decrease the node count.
Manual scaling is useful when you want to quickly adjust the resources allocated to your cluster.
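
Because the node pools described here are configured through Terraform variables, a fixed-size pool can be sketched in the same way. This is a minimal sketch and not necessarily how the platform's own module is written: it assumes the `azurerm` provider, and the `kubernetes_cluster_id` variable, the pool name, and the VM size are hypothetical.

```hcl
# Hypothetical input: ID of an existing AKS cluster.
variable "kubernetes_cluster_id" {
  type        = string
  description = "Resource ID of the AKS cluster that owns the node pool."
}

# Fixed-size node pool. Autoscaling is left disabled (the provider default),
# so changing node_count and re-applying is the manual scaling step.
resource "azurerm_kubernetes_cluster_node_pool" "fixed" {
  name                  = "fixed"           # hypothetical pool name
  kubernetes_cluster_id = var.kubernetes_cluster_id
  vm_size               = "Standard_D2s_v3" # hypothetical VM size
  node_count            = 3
}
```

Raising or lowering `node_count` and running `terraform apply` then has the same effect as changing the node count in the Azure portal or CLI.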

Cluster Autoscaler:

AKS integrates with the Kubernetes Cluster Autoscaler, which automatically adjusts the number of nodes in your cluster based on the pending workload.
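The Cluster Autoscaler watches for pods that cannot be scheduled because the existing nodes lack resources and adds nodes for them; it removes nodes that stay underutilized once their pods can run elsewhere.
This helps keep resource utilization efficient and reduces the risk of resource shortages or over-provisioning.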

Horizontal Pod Autoscaler (HPA):

AKS supports the Kubernetes Horizontal Pod Autoscaler, which automatically scales the number of pods based on CPU utilization or custom metrics.
When the workload increases, the HPA increases the number of pods; if the new pods cannot be scheduled on the existing nodes, the Cluster Autoscaler adds more nodes.
This combination of HPA and Cluster Autoscaler provides dynamic scaling of both pods and nodes based on application demands.

With these scaling options, your AKS cluster can adapt to changing workload requirements. Whether you scale manually, use the Cluster Autoscaler, rely on the Horizontal Pod Autoscaler, or explore the Virtual Node feature, AKS lets you adjust resources to balance the performance and cost-efficiency of your applications.

AKS node scaler

variable "min_node_count" {
  type        = number
  description = "The minimum number of nodes which should exist in this Node Pool. If specified this must be between 1 and 1000."
  default     = 3
}

variable "max_node_count" {
  type        = number
  description = "The maximum number of nodes which should exist in this Node Pool. If specified this must be between 1 and 1000."
  default     = 100
}
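
To illustrate how these two variables might be consumed, the sketch below wires them into an autoscaled node pool. It is an assumption about usage, not the platform's actual module: it assumes the `azurerm` provider in its 3.x series, and the `kubernetes_cluster_id` variable, the pool name, and the VM size are hypothetical.

```hcl
# Hypothetical input: ID of an existing AKS cluster.
variable "kubernetes_cluster_id" {
  type        = string
  description = "Resource ID of the AKS cluster that owns the node pool."
}

resource "azurerm_kubernetes_cluster_node_pool" "autoscaled" {
  name                  = "autoscaled"      # hypothetical pool name
  kubernetes_cluster_id = var.kubernetes_cluster_id
  vm_size               = "Standard_D2s_v3" # hypothetical VM size

  # Attribute name used by azurerm 3.x; azurerm 4.x renames it to
  # auto_scaling_enabled.
  enable_auto_scaling = true

  # Bounds within which the Cluster Autoscaler may resize the pool.
  min_count = var.min_node_count
  max_count = var.max_node_count
}
```

With the defaults above, the Cluster Autoscaler would keep the pool between 3 and 100 nodes.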

Horizontal Pod Autoscaler (Preview)

In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.

Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.

More information can be found in the official Kubernetes documentation.

The default parameters are:

scaling from 1 to 3 pods (configurable)
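
For illustration, an autoscaler with these default bounds could be declared roughly as follows. This is a minimal sketch assuming the HashiCorp `kubernetes` Terraform provider; the source does not state how the HPA is provisioned, so the Deployment name `example-app` and the 70% CPU target are hypothetical.

```hcl
resource "kubernetes_horizontal_pod_autoscaler_v2" "example" {
  metadata {
    name = "example-app" # hypothetical name
  }

  spec {
    # Default bounds described above: scale from 1 to 3 pods.
    min_replicas = 1
    max_replicas = 3

    # Workload to scale; a hypothetical Deployment.
    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "example-app"
    }

    # Assumed scaling signal: average CPU utilization across the pods.
    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 70
        }
      }
    }
  }
}
```

Utilization targets are measured against the CPU requests of the pods, so the target Deployment needs CPU requests set for this metric to work.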
