Smart City Gnosys

Smart city article details

Title An Ensemble Of Convolutional Neural Networks For Sound Event Detection
ID_Doc 8076
Authors Mukhamadiyev A.; Khujayarov I.; Nabieva D.; Cho J.
Year 2025
Published Mathematics, 13, 9
DOI http://dx.doi.org/10.3390/math13091502
Abstract Sound event detection tasks are rapidly advancing in the field of pattern recognition, and deep learning methods are particularly well suited for such tasks. One important direction in this field is detecting the sounds of emotional events around residential buildings in smart cities and quickly assessing the situation for security purposes. This research presents a comprehensive study of an ensemble convolutional recurrent neural network (CRNN) model designed for sound event detection (SED) in residential and public safety contexts. The work focuses on extracting meaningful features from audio signals using image-based representations, namely Discrete Cosine Transform (DCT) spectrograms, cochleagrams, and Mel spectrograms, to enhance robustness against noise and improve feature extraction. In collaboration with police officers, a two-hour dataset of 112 clips covering four classes of emotional sounds (harassment, quarrels, screams, and breaking sounds) was prepared. In addition to the crowdsourced dataset, publicly available datasets were used to broaden the study’s applicability. Our dataset contains 5055 strongly labeled audio files of different lengths, totaling 14.14 h and spanning 13 separate sound categories. The proposed CRNN model integrates spatial and temporal feature extraction by processing these spectrograms through convolutional and bi-directional gated recurrent unit (GRU) layers. An ensemble approach combines predictions from three models, achieving F1 scores of 71.5% for segment-based metrics and 46% for event-based metrics. The results demonstrate the model’s effectiveness in detecting sound events under noisy conditions, even with a small, unbalanced dataset. This research highlights the potential of the model for real-time audio surveillance systems using mini-computers, offering cost-effective and accurate solutions for maintaining public order. © 2025 by the authors.
Author Keywords audio signal; convolution neural network (CNN); data augmentation; DCT; deep learning; ensemble of classifiers; Mel; pattern recognition; smart city; sound event detection
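
Illustrative sketch (not from the article): the abstract reports that predictions from three models are combined and scored with a segment-based F1. The Python below shows one common way this can be done, under stated assumptions: the ensemble is a plain average of per-segment class probabilities, the decision threshold is 0.5, and the F1 is micro-averaged over all classes and segments. The article may use a different combination rule. The model names, class names, and all scores are hypothetical toy values.

```python
from typing import Dict, List

# Hypothetical type: class name -> one score per fixed-length time segment.
Scores = Dict[str, List[float]]
Labels = Dict[str, List[int]]


def ensemble_average(model_outputs: List[Scores]) -> Scores:
    """Average per-class, per-segment probabilities across models (assumed rule)."""
    n = len(model_outputs)
    return {
        cls: [sum(vals) / n for vals in zip(*(m[cls] for m in model_outputs))]
        for cls in model_outputs[0]
    }


def binarize(scores: Scores, threshold: float = 0.5) -> Labels:
    """Mark a class as active in a segment when its score reaches the threshold."""
    return {cls: [int(p >= threshold) for p in vals] for cls, vals in scores.items()}


def segment_f1(reference: Labels, predicted: Labels) -> float:
    """Micro-averaged F1 over every (class, segment) cell."""
    tp = fp = fn = 0
    for cls, ref_vals in reference.items():
        for ref, pred in zip(ref_vals, predicted[cls]):
            if ref and pred:
                tp += 1
            elif pred:
                fp += 1
            elif ref:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy outputs for three hypothetical models, one per input representation.
dct_model = {"scream": [0.9, 0.8, 0.2, 0.1], "glass_break": [0.1, 0.2, 0.7, 0.3]}
cochleagram_model = {"scream": [0.7, 0.4, 0.3, 0.2], "glass_break": [0.2, 0.1, 0.6, 0.6]}
mel_model = {"scream": [0.8, 0.6, 0.1, 0.0], "glass_break": [0.0, 0.3, 0.8, 0.3]}

# Toy ground truth: 1 means the class is active in that segment.
reference = {"scream": [1, 1, 0, 0], "glass_break": [0, 0, 1, 1]}

averaged = ensemble_average([dct_model, cochleagram_model, mel_model])
predicted = binarize(averaged)
f1 = segment_f1(reference, predicted)
print(predicted)
print(round(f1, 3))
```

In this toy case the cochleagram model alone would miss the second "scream" segment (score 0.4), while the averaged score crosses the threshold, which is the kind of error an ensemble can smooth out. Event-based metrics, which the abstract also reports, compare onset and offset times of whole events and are not covered by this sketch.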


Similar Articles


Id | Similarity | Authors | Title | Published
17908 | 0.912 | Ciaburro G. | Deep Learning Methods For Audio Events Detection | Studies in Big Data, 82 (2021)
60187 | 0.896 | Lakshmi R.; Chaitra N.C.; Thejaswini R.; Swapna H.; Parameshachari B.D.; Kumar S.D.S.; Puttegowda K. | Urban Sound Classification With Convolutional Neural Network | 2nd IEEE International Conference on Integrated Intelligence and Communication Systems, ICIICS 2024 (2024)
19236 | 0.892 | Hattaraki S.M.; Kambalimath S.G.; Savukar B.P.; Bagali S.; Dixit U.D.; Jadhav A.S. | Detection And Classification Of Various Listening Environments For Hearing-Impaired Individuals Using Crnn | 2024 International Conference on Innovation and Novelty in Engineering and Technology, INNOVA 2024 - Proceedings (2024)
52318 | 0.891 | Nogueira A.F.R.; Oliveira H.S.; Machado J.J.M.; Tavares J.M.R.S. | Sound Classification And Processing Of Urban Environments: A Systematic Literature Review | Sensors, 22, 22 (2022)
14305 | 0.89 | Agarwal M.; Gill K.S.; Chattopadhyay S.; Singh M. | Classification Of Urban Sound Using Sequential Convolutional Neural Network (Cnn) Model And Its Visualisation | 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems, ICITEICS 2024 (2024)
59656 | 0.89 | Hidayat A.; Njoo D.B.P.; Adrian G.D.; Setyoko D.E.; Wijanarko B.D. | Unlocking Soundscapes: Harnessing Machine Learning For Sound Classification | Proceeding of 2024 9th International Conference on Information Technology and Digital Applications, ICITDA 2024 (2024)
3614 | 0.886 | Mohmmad S.; Sanampudi S.K. | A Parametric Survey On Polyphonic Sound Event Detection And Localization | Multimedia Tools and Applications, 84, 20 (2025)
50556 | 0.885 | Shabbir A.; Cheema A.N.; Ullah I.; Almanjahie I.M.; Alshahrani F. | Smart City Traffic Management: Acoustic-Based Vehicle Detection Using Stacking-Based Ensemble Deep Learning Approach | IEEE Access, 12 (2024)
60186 | 0.882 | Agarwal M.; Gill K.S.; Aggarwal P.; Rawat R.S.; Sunil G. | Urban Sound Classification Using Vgg19 Convolutional Neural Network (Cnn) Model And Its Visualisation | 4th International Conference on Innovative Practices in Technology and Management 2024, ICIPTM 2024 (2024)
44289 | 0.881 | Saradopoulos I.; Potamitis I.; Ntalampiras S.; Rigakis I.; Manifavas C.; Konstantaras A. | Real-Time Acoustic Detection Of Critical Incidents In Smart Cities Using Artificial Intelligence And Edge Networks | Sensors, 25, 8 (2025)