Workload Distribution and API Server Optimization for Cloud-Native Scaling in Kubernetes

Bibliographic Details
Published in: International Journal of Computational and Experimental Science and Engineering, Vol. 11, No. 3
Main Authors: Mogal, Amit K.; Sonaje, Vaibhav P.
Format: Journal Article
Language: English
Published: 08.07.2025
ISSN: 2149-9144
DOI: 10.22399/ijcesen.2820

Summary: The rapid adoption of container orchestration platforms, particularly Kubernetes, has revolutionized the deployment and scaling of cloud-native applications. However, as cluster size and workload complexity increase, Kubernetes often suffers performance degradation due to inefficient workload distribution and API server bottlenecks. This paper investigates the architectural and operational limitations that emerge in large-scale Kubernetes deployments, focusing on API server saturation and imbalanced workload scheduling. Drawing on real-world deployment data and synthetic stress testing, we analyze the scalability thresholds imposed by the Kubernetes control plane and identify key inefficiencies in the default scheduler and load distribution strategies.

To address these challenges, we propose a novel optimization framework that integrates dynamic workload partitioning, intelligent pod-to-node assignment, and API call reduction techniques. Our method leverages asynchronous state propagation and fine-grained node labeling to improve scheduler decisions while introducing minimal latency. Experimental evaluation across clusters of varying sizes demonstrates up to a 47% improvement in resource utilization, a 35% reduction in API server load, and faster convergence during scale-out events. These results position the proposed solution as a viable enhancement for production-grade Kubernetes environments operating at scale.
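The abstract does not reproduce the authors' implementation, but the two ingredients it names, asynchronous state propagation with reduced API calls and fine-grained node labeling, map onto a standard Kubernetes client pattern. The Go sketch below uses client-go's shared informers to maintain a local, watch-driven cache of node state (one LIST plus a WATCH stream instead of repeated reads against the API server) and then filters candidate nodes by label from that cache. The label key "workload-tier" is hypothetical, and this is an illustrative sketch of the general technique, not the paper's actual framework.

package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (assumes running outside the cluster).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// A shared informer keeps a local, watch-driven cache of cluster state,
	// so repeated reads hit the cache rather than the API server.
	factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
	nodeInformer := factory.Core().V1().Nodes()

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)

	// Block until the initial LIST/WATCH has populated the cache.
	if !cache.WaitForCacheSync(stop, nodeInformer.Informer().HasSynced) {
		panic("node cache failed to sync")
	}

	// Fine-grained node labels (hypothetical key "workload-tier") let a
	// scheduling component narrow its candidate set without extra API calls.
	selector := labels.SelectorFromSet(labels.Set{"workload-tier": "burst"})
	nodes, err := nodeInformer.Lister().List(selector)
	if err != nil {
		panic(err)
	}
	for _, n := range nodes {
		fmt.Println("candidate node:", n.Name)
	}
}

In this pattern a single watch stream replaces polling, which is the same direction of optimization as the API server load reduction the abstract reports, though the paper's own mechanism may differ.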