DEEP LEARNING FRAMEWORK FOR REAL-TIME SIREN DETECTION IN URBAN AUDIO STREAMS
DOI:
https://doi.org/10.62643/ijerst.2026.v22.n2(2).2913
Keywords:
Emergency Siren Detection, Audio Surveillance, Smart City Applications, Deep Learning, Neuro Fusion Model (NFM).
Abstract
Rapid urbanization and the expansion of intelligent transportation systems have intensified the demand for real-time audio surveillance, with recent smart city deployments reporting significant reliance on automated acoustic monitoring for emergency response and traffic management. Studies indicate that delayed detection of emergency sirens in congested urban environments can increase response times and compromise public safety, highlighting the need for robust real-time audio analytics. Traditional siren and alarm detection systems rely predominantly on fixed-threshold analysis of frequency or amplitude, which works well only under controlled acoustic conditions. These systems lack adaptability, and their performance often degrades when they are exposed to environmental noise, overlapping sound sources, and signal variability. The core problem is to develop a noise-resilient, adaptive, and computationally efficient system that can accurately detect and classify critical audio patterns across diverse real-world scenarios. This research presents an intelligent system for detecting emergency vehicle sirens using machine learning and deep learning techniques. The system begins with audio preprocessing steps, including mono conversion and resampling, to ensure consistent signal quality. Acoustic features are then extracted using Mel-Frequency Cepstral Coefficients (MFCC) and Chroma features, which capture important frequency and pitch characteristics of siren sounds. These features are used to train several classification models, including Generalized Learning Vector Quantization (GLVQ), a Perceptron, and a Multi-Layer Perceptron (MLP) neural network. In addition, a hybrid model combining a Deep Neural Network with a Perceptron classifier (DNN-Perceptron), also referred to as the Neuro Fusion Model (NFM), is proposed to improve classification performance.
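The preprocessing stage described above (mono conversion followed by resampling) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `to_mono` and `resample_linear`, the channel-averaging rule, the linear-interpolation resampler, and the sample rates used in the example are all assumptions chosen for illustration. A production pipeline would normally use a band-limited resampler with anti-aliasing.

```python
from typing import List, Sequence


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Collapse multi-channel frames to one channel by averaging (assumed rule)."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(signal: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation.

    Illustrative only: there is no anti-aliasing filter, so downsampling
    real audio with this function would introduce aliasing.
    """
    if not signal or src_rate == dst_rate:
        return list(signal)
    n_out = max(1, round(len(signal) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate  # fractional index into the source signal
        lo = int(pos)
        if lo >= len(signal) - 1:
            # Past the last interpolation interval: hold the final sample.
            out.append(signal[-1])
        else:
            frac = pos - lo
            out.append(signal[lo] * (1 - frac) + signal[lo + 1] * frac)
    return out


# Hypothetical stereo clip: four frames of (left, right) samples.
stereo = [(0.0, 0.2), (0.4, 0.6), (0.8, 1.0), (0.4, 0.2)]
mono = to_mono(stereo)
# Hypothetical rates; the abstract does not state the target sample rate.
resampled = resample_linear(mono, src_rate=4, dst_rate=8)
```

Bringing every clip to one channel and one sample rate means that features computed afterwards, such as MFCC and Chroma, are comparable across recordings.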
The models are evaluated using accuracy, precision, recall, and F-score, along with confusion matrices and ROC curves. Experimental results show that the NFM achieves the highest accuracy among the evaluated models. The system also includes a graphical interface for dataset upload, training, and real-time siren prediction.
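The evaluation metrics named above can be computed directly from confusion-matrix counts. The sketch below assumes a binary siren / non-siren labelling and reads "F-score" as the F1 score (equal weight on precision and recall). The abstract confirms neither, and the function name `binary_metrics` and the example labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from binary labels (assumed setting)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1,
    }


# Hypothetical labels: 1 = siren, 0 = non-siren.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

For siren detection, recall is usually the figure to watch, since a missed siren (false negative) costs more than a false alarm.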
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.













