Scalable Cloud-Native Architecture for Automated Anomaly Detection and Intelligent Response in Kubernetes and AKS Platforms

Authors

  • Hing-Yan Lee

DOI:

https://doi.org/10.62643/ijerst.2025.v21.n2.3146

Keywords:

Scalable, Cloud, Anomaly Detection, Intelligent Response, Kubernetes, AKS Platforms

Abstract

The rapid adoption of cloud-native architectures and platforms such as Kubernetes and Azure Kubernetes Service (AKS) has enabled large-scale elasticity, but it has also made system observability considerably more complex. Traditional monitoring tools based on static thresholds are increasingly inadequate for highly dynamic, ephemeral microservices, and they frequently lead to alert fatigue and a prolonged Mean Time to Resolution (MTTR) during critical outages. To address the limitations of reactive, human-in-the-loop operations, this paper proposes a novel, scalable cloud-native architecture that couples deep learning-based anomaly detection directly with an automated Kubernetes response engine. The approach employs a dual-model machine learning pipeline, combining Long Short-Term Memory (LSTM) networks for predictive time-series forecasting with Isolation Forests for real-time outlier detection, embedded within the AKS control plane via custom operators. Upon detecting an anomaly, the system autonomously triggers predefined, RBAC-compliant remediation policies, such as dynamic horizontal scaling or targeted pod restarts. Empirical evaluations in a simulated production environment show that the proposed architecture achieves an F1-score of 0.92 for anomaly detection and reduces remediation times from several minutes to seconds. These findings indicate that the marginal increase in monitoring overhead is outweighed by the improvements in system reliability and autonomous self-healing.
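
The detect-then-respond flow summarized above can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation. A moving-average forecast stands in for the LSTM forecaster, and a median-absolute-deviation score stands in for the Isolation Forest. The function names, the thresholds, the rule that both detectors must agree, and the mapping from metric to action are all assumptions made for illustration.

```python
from statistics import mean, median

def forecast_residual(history, observed, window=5):
    """Stand-in for the LSTM forecaster (assumption): predict the next value
    as the mean of the last `window` points and return |observed - predicted|."""
    predicted = mean(history[-window:])
    return abs(observed - predicted)

def mad_score(history, observed):
    """Stand-in for the Isolation Forest (assumption): robust z-score based
    on the median absolute deviation (MAD) of the history."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        return 0.0 if observed == med else float("inf")
    # 1.4826 scales the MAD to match the standard deviation of normal data.
    return abs(observed - med) / (1.4826 * mad)

# Hypothetical allow-list of predefined remediation actions.
ALLOWED_ACTIONS = {"scale_out", "restart_pod"}

def choose_action(metric, history, observed, residual_limit, score_limit=3.5):
    """Return a remediation action name, or None when no action is warranted.

    Assumption: an action fires only when both detectors flag the sample.
    """
    forecast_flag = forecast_residual(history, observed) > residual_limit
    outlier_flag = mad_score(history, observed) > score_limit
    if not (forecast_flag and outlier_flag):
        return None
    # Illustrative policy: load-type metrics scale out, anything else restarts.
    action = "scale_out" if metric == "cpu_utilization" else "restart_pod"
    return action if action in ALLOWED_ACTIONS else None
```

In a real deployment the chosen action would be carried out by an operator whose service account is limited by RBAC to exactly those verbs. Here it is only returned as a string, so the sketch stays independent of any cluster API.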

Published

17-06-2025

How to Cite

Scalable Cloud-Native Architecture for Automated Anomaly Detection and Intelligent Response in Kubernetes and AKS Platforms. (2025). International Journal of Engineering Research and Science & Technology, 21(2), 3088-3095. https://doi.org/10.62643/ijerst.2025.v21.n2.3146