Cut Kubernetes Costs with Smarter Node Utilization on Amazon EKS

Top 5 Expert-Driven Strategies to Reduce Cloud Waste by Up to 40%

As Amazon Elastic Kubernetes Service (EKS) becomes a popular choice for running Kubernetes in the cloud, organizations often overlook an important aspect of cost optimization: node utilization. While EKS offers ease of scaling and management, inefficient node usage can lead to significant cost overruns. 

For many EKS clusters, wasted resources result from overprovisioned pods, idle nodes, and poor scheduling, all of which lead to unnecessary charges. By optimizing node utilization, organizations can cut costs by up to 40% while ensuring high availability and scalability for their applications.

In this blog, we will explore best practices to optimize node usage in EKS clusters and reduce cloud costs.

What is Node Utilization in Kubernetes?

Node utilization in EKS refers to how efficiently the worker nodes in your cluster are used to handle workloads. In Amazon EKS, nodes are EC2 instances that run the Kubernetes workloads (pods). Poor node utilization occurs when:

  • Nodes are underutilized (e.g., nodes running at low CPU or memory usage)
  • Pods are overprovisioned, requesting more CPU or memory than they actually use
  • Idle or unused nodes remain running and incur unnecessary charges

Effective node utilization ensures that your applications are running on the minimal number of nodes needed to meet their resource requirements without overprovisioning.

Strategies for Optimizing Node Utilization in Kubernetes

1. Select the appropriate EC2 instance type for your EKS nodes

Selecting the appropriate EC2 instance types for your EKS nodes is crucial for cost efficiency. Amazon EKS lets you use a wide variety of EC2 instances as worker nodes, but a poor choice leads to either underutilized resources or performance bottlenecks. To optimize costs without compromising performance, monitor CPU and memory utilization with Amazon CloudWatch and EKS metrics, then select instance types whose CPU-to-memory ratio matches your workload requirements.

For example, smaller workloads can run on cost-effective burstable instances such as t3.medium, while memory-intensive applications may require instances like r5.xlarge to ensure optimal performance. Very small types such as t3.micro are cheap but, with the default Amazon VPC CNI, support only a handful of pods per node, so they rarely pack efficiently.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-cluster
  region: us-west-2

nodeGroups:
  - name: small-nodegroup
    instanceType: t3.medium
    desiredCapacity: 2

By selecting the right EC2 instance types based on workload characteristics, you can ensure that your nodes are neither overprovisioned nor underutilized.

2. Enable Cluster Autoscaler and Use EC2 Spot Instances

Amazon EC2 Spot Instances allow you to utilize unused EC2 capacity at up to 90% lower cost than On-Demand Instances. They are ideal for fault-tolerant, stateless, or flexible workloads that can withstand interruptions. Leveraging Spot Instances can significantly reduce compute costs in your EKS cluster.

Before using Spot Instances effectively, ensure your EKS cluster can scale dynamically with workload demand by enabling and configuring the Cluster Autoscaler. The autoscaler watches for pods that cannot be scheduled because of insufficient resources and grows the node group accordingly; it also removes nodes that stay underutilized once their pods can be placed elsewhere. Proper tuning keeps your cluster right-sized and avoids overprovisioning, setting a strong foundation for cost-efficient Spot usage.
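As a rough sketch of what tuning can look like, the fragment below shows a few Cluster Autoscaler flags as they might appear in the container spec of its Deployment. The cluster name my-cluster matches the eksctl example in the previous section; the threshold and timing values are illustrative assumptions, not recommendations, and should be adjusted to your workloads.

```yaml
# Fragment of the cluster-autoscaler container spec (values are illustrative).
command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  # Discover node groups through the Auto Scaling group tags for this cluster.
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
  # When several node groups fit, pick the one that wastes the least capacity.
  - --expander=least-waste
  # Consider a node for removal when its requested resources fall below 50%.
  - --scale-down-utilization-threshold=0.5
  # Wait until a node has been unneeded for 10 minutes before removing it.
  - --scale-down-unneeded-time=10m
  # Keep similarly sized node groups (for example, one per zone) balanced.
  - --balance-similar-node-groups
```

A lower utilization threshold makes scale-down more conservative, while a shorter unneeded time removes idle nodes sooner at the cost of more node churn.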

Important: Spot Instances can be interrupted with a two-minute warning when capacity is reclaimed by AWS. Avoid using Spot for critical workloads that require high availability or cannot tolerate disruptions.

How to Use Spot Instances in EKS:

  • Mixed Node Groups: Run Spot and On-Demand node groups side by side to balance cost and availability. A single managed node group uses one capacity type, so the mix comes from pairing groups.
  • Karpenter: Use Karpenter for automatic provisioning of Spot Instances in your EKS cluster.
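
For the Karpenter option, a minimal sketch of a NodePool that allows both Spot and On-Demand capacity might look like the following. It assumes Karpenter v1 is already installed and that an EC2NodeClass named default exists; both names, the CPU limit, and the consolidation delay are placeholders rather than values from this article.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default                 # placeholder name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumed to exist already
      requirements:
        # Let Karpenter choose Spot when available and fall back to On-Demand.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  # Cap the total CPU this NodePool may provision (illustrative value).
  limits:
    cpu: "100"
  disruption:
    # Remove or replace nodes that are empty or underutilized.
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```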

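Example of a Spot node group paired with an On-Demand node group (eksctl):

managedNodeGroups:
  # Spot capacity for fault-tolerant workloads. Listing several similar
  # instance types lowers the chance of running out of Spot capacity.
  # Managed node groups pay the current Spot price; no bid price is set.
  - name: spot-nodegroup
    instanceTypes:
      - t3.medium
      - t3a.medium
    spot: true
    desiredCapacity: 4
    minSize: 2
    maxSize: 6

  # On-Demand baseline for workloads that cannot tolerate interruptions.
  - name: on-demand-nodegroup
    instanceType: t3.medium
    desiredCapacity: 2
    minSize: 1
    maxSize: 3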

By blending On-Demand and Spot Instances strategically, you can drastically reduce compute costs while still ensuring that critical workloads remain stable.
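
To keep interruption-tolerant workloads on the cheaper capacity, one option is to pin them to Spot nodes and protect them with a PodDisruptionBudget. The sketch below assumes EKS managed node groups, which label their nodes with eks.amazonaws.com/capacityType; the name batch-worker and the image reference are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker            # hypothetical workload name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      # Schedule only onto nodes that EKS has labeled as Spot capacity.
      nodeSelector:
        eks.amazonaws.com/capacityType: SPOT
      # Give the pod time to finish in-flight work after a termination signal.
      terminationGracePeriodSeconds: 60
      containers:
        - name: worker
          image: example.com/batch-worker:1.0   # placeholder image
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-worker
spec:
  # Keep at least 2 replicas running during voluntary node drains.
  minAvailable: 2
  selector:
    matchLabels:
      app: batch-worker
```

Critical workloads can use the same pattern with the value ON_DEMAND so that they never land on Spot nodes.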

3. Optimize Resource Requests and Limits

Many Kubernetes workloads in Amazon EKS are overprovisioned due to high resource requests, which leads to inefficient use of resources and increased costs. Properly setting resource requests and limits ensures that workloads use only the resources they need, allowing the cluster to operate more efficiently. Begin by monitoring actual resource usage through Amazon CloudWatch or Prometheus to understand how much CPU and memory pods truly require. 

With this data, set resource requests conservatively, starting from lower values and adjusting upward as needed based on real usage trends. Incorporating Horizontal Pod Autoscaler (HPA) allows the number of pod replicas to increase or decrease automatically depending on the workload, which helps in maintaining application performance while controlling resource consumption.

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"

Properly setting resource requests ensures that your EKS nodes are packed efficiently, reducing waste and cost.
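
A minimal Horizontal Pod Autoscaler sketch to go with these requests is shown below. It assumes the Kubernetes Metrics Server is installed and that a Deployment named my-app exists; the replica bounds and the 70% target are illustrative values, not figures from this article.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # Percentage of the pods' CPU request, averaged across replicas.
          averageUtilization: 70
```

Because the target is expressed relative to the CPU request, inflated requests make the HPA scale later than it should, which is another reason to right-size requests first.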

4. Use Topology Spread Constraints for Better Pod Distribution

Topology spread constraints are Kubernetes scheduling policies that help ensure pods are evenly distributed across specified topology domains, such as availability zones or individual nodes. This even distribution improves fault tolerance and availability while also enhancing cost efficiency. By avoiding pod concentration in a single zone or node, topology spread constraints help prevent situations where some nodes are overburdened and others remain idle. 

This leads to better resource utilization across the cluster and reduces the risk of underutilized infrastructure, ultimately lowering operational costs and improving resilience during zonal failures or disruptions.

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: "topology.kubernetes.io/zone"
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: my-app

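By balancing pod distribution, you keep load even across nodes and zones and improve overall cluster efficiency. Setting whenUnsatisfiable to ScheduleAnyway makes the spread a preference rather than a hard rule, so the scheduler does not leave pods pending, or force extra nodes to be added, just to satisfy the constraint.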
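
The example above spreads pods across zones only. As a sketch of spreading across both topology domains mentioned earlier (zones and individual nodes), two constraints can be combined in the same pod template; the app: my-app label is carried over from the example above.

```yaml
# Pod template fragment (spec.template.spec of a Deployment).
topologySpreadConstraints:
  # Prefer an even spread across availability zones.
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: my-app
  # Also prefer an even spread across individual nodes.
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: my-app
```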

5. Use Amazon EKS Managed Node Groups

Amazon EKS Managed Node Groups (MNGs) simplify the process of provisioning and managing worker nodes by automating the creation, updating, and termination of EC2 instances in your Kubernetes cluster. Instead of manually handling EC2 node setup and maintenance, MNGs allow you to define your node group configuration declaratively, and Amazon EKS takes care of the rest. This significantly reduces operational overhead and ensures that your nodes are correctly configured with appropriate IAM roles, security groups, and networking settings. 

Managed Node Groups also integrate natively with the Cluster Autoscaler, allowing your cluster to automatically scale up or down based on workload demand. This ensures that resources are used efficiently and idle infrastructure is minimized, directly contributing to cost savings. Furthermore, MNGs support rolling updates that drain and replace nodes in batches, so you can apply security patches or Kubernetes version upgrades with little or no downtime for workloads that run multiple replicas.
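
A declarative node group definition of this kind could be sketched with eksctl as follows. The cluster name and region are reused from the first example; the node group name, label, and sizes are hypothetical.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-cluster
  region: us-west-2

managedNodeGroups:
  - name: managed-general       # hypothetical node group name
    instanceType: t3.medium
    minSize: 2
    desiredCapacity: 2
    maxSize: 6
    labels:
      workload: general         # hypothetical label for scheduling
    iam:
      withAddonPolicies:
        # Attach the IAM permissions the Cluster Autoscaler needs.
        autoScaler: true
    updateConfig:
      # Replace at most one node at a time during rolling updates.
      maxUnavailable: 1
```

The minSize and maxSize bounds are what the Cluster Autoscaler scales within, so they effectively set the floor and ceiling of the node group's cost.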

Case Study: How FinPeak Analytics Saved Over 70% on EKS Costs

FinPeak Analytics is a fintech startup running real-time analytics and reporting platforms on Amazon EKS. Their workloads include data processing jobs, REST APIs, and batch reports. Their DevOps team observed escalating AWS bills, particularly from underutilized EC2 nodes in their Kubernetes clusters.

Problem: Overprovisioned EKS Cluster

FinPeak initially ran an EKS cluster with 12 m5.large On-Demand nodes, each costing $0.096/hour in the us-east-1 region.

  • Hourly cost: 12 × $0.096 = $1.152/hour
  • Monthly cost (730 hours): $1.152 × 730 = $840.96/month (about $841)

Upon reviewing CloudWatch metrics, they discovered:

  • Pods were overprovisioned (requests > actual usage)
  • Many nodes were only 30–40% utilized
  • No autoscaler was in place

Solution: Node Utilization Optimization

The FinPeak team implemented the following:

  1. Right-sized resource requests/limits using historical usage data
  2. Enabled Cluster Autoscaler to scale down idle nodes
  3. Introduced 50% Spot Instances using mixed node groups
  4. Switched instance type from m5.large to cost-effective t3.medium
  5. Used Karpenter to autoscale based on actual pod needs

New Setup:

  • 6 t3.medium On-Demand @ $0.0416/hour
  • 6 t3.medium Spot @ $0.0125/hour
  • Hourly cost: (6 × $0.0416) + (6 × $0.0125) = $0.3246/hour
  • Monthly cost: $0.3246 × 730 = $236.96/month

Results: 71.8% Cost Reduction

Metric                 | Before Optimization  | After Optimization
-----------------------|----------------------|-------------------
Instance Type          | m5.large (On-Demand) | t3.medium (Mixed)
Nodes                  | 12                   | 12
Monthly Cost           | $840.96              | $236.96
Monthly Savings        | n/a                  | $604.00
Utilization Efficiency | ~35%                 | ~75%+

By right-sizing pods, introducing Spot capacity, and using intelligent autoscaling, FinPeak Analytics saved over $600/month, a 71.8% cost reduction, without sacrificing availability or performance.
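
As a check on the headline figure, the saving can be worked out directly from the two hourly costs listed above, assuming the same 12 nodes run for the same number of hours before and after:

```latex
\[
\text{saving}
= \frac{C_{\text{before}} - C_{\text{after}}}{C_{\text{before}}}
= \frac{1.152 - 0.3246}{1.152}
= \frac{0.8274}{1.152}
\approx 0.72
\]
```

That is a reduction of roughly 72%, independent of how many hours are billed in the month.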

Conclusion

Optimizing node utilization in Amazon EKS is one of the most effective ways to reduce cloud costs and improve operational efficiency. By selecting the right EC2 instance types, leveraging Spot Instances, tuning Cluster Autoscaler, and implementing best practices like topology spread constraints, you can significantly reduce your cloud waste and cut Kubernetes costs by up to 40%.
