Kubernetes provides powerful features for managing containerized applications, including auto-scaling, which dynamically adjusts the number of running instances based on workload metrics. The kubectl autoscale command creates a HorizontalPodAutoscaler (HPA) for a deployment, replica set, stateful set, or replication controller, helping you keep resource utilization and responsiveness in balance. Note that kubectl autoscale itself only accepts a CPU utilization target (--cpu-percent); scaling on memory, custom, or external metrics requires an autoscaling/v2 HorizontalPodAutoscaler manifest applied with kubectl apply.
Here are several examples demonstrating how to use kubectl autoscale and the HPA effectively:
Example 1: Auto-scale a deployment named frontend to target 50% average CPU utilization, with a minimum of 2 replicas and a maximum of 5 replicas.
kubectl autoscale deployment frontend --cpu-percent=50 --min=2 --max=5
Output: horizontalpodautoscaler.autoscaling/frontend autoscaled
Verification Steps: Check the HPA with kubectl get hpa frontend, then watch the replica count with kubectl get deployment frontend as CPU load changes.
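For reference, the command above is roughly equivalent to applying the following autoscaling/v2 manifest (a sketch; the metrics-server must be installed for CPU utilization to be available):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50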
Example 2: Set auto-scaling for a deployment named backend based on a custom metric, such as requests per second (rps), targeting an average of 100 rps per pod, with a minimum of 3 replicas and a maximum of 10 replicas. kubectl autoscale has no --metric-name or --metric-target flags; custom metrics are configured with an autoscaling/v2 manifest (see the sketch below), saved to a file such as backend-hpa.yaml and applied:
kubectl apply -f backend-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/backend created
Verification Steps: Monitor the metric with tools like Prometheus or Grafana, and check kubectl get hpa backend to ensure the deployment scales around the 100 rps target.
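A minimal manifest sketch, assuming a custom metrics adapter (for example, the Prometheus Adapter) already exposes a per-pod metric named requests-per-second:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests-per-second
      target:
        type: AverageValue
        averageValue: "100"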
Example 3: Auto-scale a stateful set named database based on memory usage, aiming to keep average memory utilization around 70%, with a minimum of 1 replica and a maximum of 3 replicas. There is no --memory-percent flag; memory-based scaling is expressed as a Resource metric in an autoscaling/v2 manifest (see the sketch below), saved to a file such as database-hpa.yaml and applied:
kubectl apply -f database-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/database created
Verification Steps: Use kubectl get hpa database and kubectl describe statefulset database to check the current replica count, and monitor memory metrics to confirm the auto-scaling behavior.
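A manifest sketch for memory-based scaling, assuming the metrics-server is installed and the database containers declare memory requests (utilization is computed against requests):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: database
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: database
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70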
Example 4: Implement auto-scaling for a deployment named api based on GPU utilization, targeting an average of 80% GPU usage per pod, with a minimum of 2 replicas and a maximum of 6 replicas. GPU utilization is not a built-in resource metric and cannot be targeted by kubectl autoscale; it has to be exposed as a custom per-pod metric (for example, via the NVIDIA DCGM exporter, Prometheus, and the Prometheus Adapter) and referenced from an autoscaling/v2 manifest (see the sketch below):
kubectl apply -f api-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/api created
Verification Steps: Use the NVIDIA DCGM exporter to monitor GPU metrics, and check kubectl get hpa api to ensure auto-scaling aligns with GPU utilization.
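A sketch under those assumptions; the metric name gpu_utilization is a placeholder for whatever name your metrics adapter exposes (with the DCGM exporter it is typically derived from DCGM_FI_DEV_GPU_UTIL):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: gpu_utilization    # placeholder; use the name your adapter exposes
      target:
        type: AverageValue
        averageValue: "80"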
Example 5: Set auto-scaling for a deployment named worker based on an external metric, such as one sourced from a queue or a custom API, maintaining an average of 200 requests per minute per pod, with a minimum of 2 replicas and a maximum of 8 replicas. There is no --external-metric-name flag; external metrics require an external metrics adapter and an autoscaling/v2 manifest with an External metric (see the sketch below):
kubectl apply -f worker-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/worker created
Verification Steps: Monitor the external metric directly or through an integrated monitoring system, and check kubectl get hpa worker to ensure the deployment scales as expected.
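A manifest sketch, assuming an external metrics adapter exposes a metric named requests-per-minute (the exact name and any label selector depend on the adapter's configuration):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 2
  maxReplicas: 8
  metrics:
  - type: External
    external:
      metric:
        name: requests-per-minute
      target:
        type: AverageValue
        averageValue: "200"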
Example 6: Auto-scale a deployment named load-balancer based on HTTP request rate, aiming for an average of 100 requests per second per pod, with a minimum of 3 replicas and a maximum of 12 replicas. Like Example 2, this uses a per-pod custom metric in an autoscaling/v2 manifest (see the sketch below) rather than kubectl autoscale flags:
kubectl apply -f load-balancer-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/load-balancer created
Verification Steps: Use your monitoring tools to analyze HTTP request metrics and verify that the deployment scales accordingly.
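A manifest sketch, assuming a custom metrics adapter exposes a per-pod metric named http_requests_per_second:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: load-balancer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: load-balancer
  minReplicas: 3
  maxReplicas: 12
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"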
Example 7: Set auto-scaling for a deployment named finance-app based on a custom Prometheus metric derived from a specific query, with a minimum of 4 replicas and a maximum of 10 replicas. The HPA cannot evaluate a PromQL query directly; the Prometheus Adapter first translates the query into a named metric (see the rule sketch below), and the HPA then references that metric exactly like the Pods-metric manifests in Examples 2 and 6, with minReplicas: 4 and maxReplicas: 10:
kubectl apply -f finance-app-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/finance-app created
Verification Steps: Use the Prometheus query interface to validate the metric values and observe the deployment scaling based on the query results.
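A sketch of a Prometheus Adapter rule that turns a counter into a per-pod rate the HPA can consume; the series name http_requests_total is a hypothetical application metric, and the exact location of the adapter's config file depends on how the adapter is deployed:
rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^(.*)_total$"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
The resulting metric appears to the HPA as http_requests_per_second and can be referenced as a Pods metric.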
Example 8: Auto-scale a deployment named image-processing based on average network traffic, aiming for roughly 1 MB/s per pod, with a minimum of 2 replicas and a maximum of 5 replicas. Network throughput is not a built-in resource metric, so it must be exposed as a custom per-pod metric and referenced from an autoscaling/v2 manifest (see the sketch below):
kubectl apply -f image-processing-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/image-processing created
Verification Steps: Monitor network traffic metrics using tools like Istio or Calico and confirm the deployment scales based on the observed traffic patterns.
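A manifest sketch, assuming a custom metrics adapter exposes per-pod network throughput in bytes per second under the hypothetical name network_bytes_per_second (the quantity 1M means 1,000,000, i.e. about 1 MB/s):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: image-processing
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: image-processing
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: network_bytes_per_second    # hypothetical metric name
      target:
        type: AverageValue
        averageValue: "1M"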
Example 9: Keep a deployment named analytics available during maintenance or disruptions while auto-scaling it, with a minimum of 3 replicas and a maximum of 7 replicas. kubectl autoscale has no --pdb-based flag, and a Pod Disruption Budget (PDB) does not drive autoscaling; it only limits how many pods may be evicted at once during voluntary disruptions such as node drains. The usual pattern is to combine a metric-based HPA (for example, a CPU target) with a separate PDB:
kubectl autoscale deployment analytics --cpu-percent=70 --min=3 --max=7
Output: horizontalpodautoscaler.autoscaling/analytics autoscaled
Verification Steps: Use kubectl get hpa analytics and kubectl get pdb to confirm both objects exist, and kubectl describe deployment analytics to check that the replica count stays within the configured range during disruptions.
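A PDB sketch to pair with the HPA; the app: analytics label selector is an assumption and should match the labels on the deployment's pods:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: analytics
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: analytics    # assumed pod label
Apply it with kubectl apply -f analytics-pdb.yaml.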
Example 10: Auto-scale a deployment named game-server based on a custom application metric, targeting a specific per-pod threshold, with a minimum of 2 replicas and a maximum of 4 replicas. As in the earlier custom-metric examples, this is configured through an autoscaling/v2 manifest (see the sketch below) rather than kubectl autoscale flags:
kubectl apply -f game-server-hpa.yaml
Output: horizontalpodautoscaler.autoscaling/game-server created
Verification Steps: Use application-specific monitoring tools to verify the metric values, and kubectl get hpa game-server to observe scaling based on application performance.
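A sketch that keeps the placeholders from the original example; substitute the metric name your application exposes through the custom metrics adapter and the threshold that fits it:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: game-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: game-server
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Pods
    pods:
      metric:
        name: custom_app_metric        # placeholder metric name
      target:
        type: AverageValue
        averageValue: "100"            # placeholder threshold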
Also check these similar articles:
Scaling Kubernetes Deployments with kubectl scale
Manage Resource Rollouts with kubectl rollout
Efficiently Delete Kubernetes Resources with kubectl delete
Comprehensive Guide to kubectl get Command
Understanding Kubernetes Resources with kubectl explain