Scalable Cloud-Native Architecture for Automated Anomaly Detection and Intelligent Response in Kubernetes and AKS Platforms
DOI:
https://doi.org/10.62643/ijerst.2025.v21.n2.3146Keywords:
Scalable, Cloud, Anomaly Detection, Intelligent Response, Kubernetes, AKS PlatformsAbstract
The rapid adoption of cloud-native architectures and platforms like Kubernetes and Azure Kubernetes Service (AKS) has introduced unprecedented scalability, but it has also brought immense complexity to system observability. Traditional, static threshold-based monitoring tools are increasingly inadequate for managing highly dynamic, ephemeral microservices, frequently resulting in alert fatigue and prolonged Mean Time to Resolution (MTTR) during critical outages. To address the limitations of reactive, human-in-the-loop engineering operations, this paper proposes a novel, highly scalable cloudnative architecture that seamlessly integrates deep learning-based anomaly detection directly with an automated Kubernetes response engine. Our approach employs a dual-model machine learning pipeline—combining Long Short-Term Memory (LSTM) networks for predictive time-series forecasting and Isolation Forests for real-time outlier detection—embedded within the AKS control plane via custom operators. Upon detecting an anomaly, the system autonomously triggers predefined, RBAC-compliant remediation policies, such as dynamic horizontal scaling or targeted pod restarts. Empirical evaluations within a simulated production environment demonstrate that the proposed architecture achieves a 0.92 F1-score in detection accuracy and slashes remediation times from several minutes to mere seconds. These findings prove that the marginal increase in computational monitoring overhead is decisively outweighed by profound improvements in system reliability and autonomous self-healing capabilities.
Downloads
Published
Issue
Section
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.













