Smart City Gnosys

Smart city article details

Title: Environmental Sound Classification With Low-Complexity Convolutional Neural Network Empowered By Sparse Salient Region Pooling
ID_Doc: 24260
Authors: Seresht H.R.; Mohammadi K.
Year: 2023
Published: IEEE Access, 11
DOI: http://dx.doi.org/10.1109/ACCESS.2022.3232807
Abstract: Environmental Sound Classification (ESC) is an important field in a broad range of applications, such as smart cities, audio surveillance, and health care. Recently, Convolutional Neural Networks (CNNs) have taken the lead from traditional approaches and have produced promising results. However, the achieved improvements are often accompanied by increasing depth, complexity, and size of the network, which prevents their use in many practical applications. In this work, our goal is to empower a small, low-complexity CNN model to achieve superior performance. To this end, we concentrate on the importance of the global pooling technique, which has been less investigated in ESC. In most previous works, models use a global average pooling layer, which does not consider regional saliency and thus weakens the contribution of the salient time-frequency regions to the classification and to the training of the convolutional kernels. We propose a novel global pooling method, called Sparse Salient Region Pooling (SSRP), which computes the channel descriptors using a sparse subset of features and guides the model to learn effectively from the more salient time-frequency regions. Experimental results demonstrate that the proposed model, with only 700K parameters, yields accuracies of 86.7% on ESC-50 and 94.8% on ESC-10, which are comparable to those of the state-of-the-art methods. Compared to the baseline model, our model achieves an absolute improvement of 21.8% in accuracy on ESC-50, with a 98% smaller model size. Our visual analyses show that SSRP intensifies the responses of low-energy regions such that they contribute even more than high-energy regions to the classification of specific sound classes. © 2013 IEEE.
Author Keywords: Convolutional neural networks; environmental sound classification; global feature pooling; low complexity; regional saliency
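
The contrast the abstract draws between global average pooling and pooling over a sparse subset of features can be illustrated with a small sketch. This is not the authors' SSRP implementation: the abstract does not say how the sparse subset is chosen, so the sketch assumes a plain top-k selection per channel. The function names, the toy feature maps, and the value of k are all hypothetical.

```python
from heapq import nlargest


def global_average_pool(feature_maps):
    """One descriptor per channel: the mean over all time-frequency bins."""
    return [sum(ch) / len(ch) for ch in feature_maps]


def sparse_topk_pool(feature_maps, k):
    """One descriptor per channel: the mean of the k largest activations.

    Assumption: the "sparse subset" is modelled as top-k selection. The
    abstract does not state the selection rule that SSRP actually uses.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    descriptors = []
    for ch in feature_maps:
        top = nlargest(min(k, len(ch)), ch)
        descriptors.append(sum(top) / len(top))
    return descriptors


# Toy input: 2 channels, each a flattened 2x4 time-frequency map.
# Channel 0 has a small, strongly activated region; channel 1 is flat.
feature_maps = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
]

gap = global_average_pool(feature_maps)
sparse = sparse_topk_pool(feature_maps, k=2)

print(gap)     # [1.0, 1.0]
print(sparse)  # [4.0, 1.0]
```

In this toy case, averaging over every bin gives both channels the same descriptor, while pooling over the two strongest bins keeps the localized response in channel 0 visible. That is the kind of dilution of salient regions the abstract attributes to global average pooling.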


Similar Articles


Id | Similarity | Authors | Title | Published
14555 | 0.938 | Seker H.; Inik O. | Cnnsound: Convolutional Neural Networks For The Classification Of Environmental Sounds | ACM International Conference Proceeding Series (2020)
24259 | 0.907 | Das J.K.; Chakrabarty A.; Piran M.J. | Environmental Sound Classification Using Convolution Neural Networks With Different Integrated Loss Functions | Expert Systems, 39, 5 (2022)
14548 | 0.895 | İnik Ö. | CNN Hyper-Parameter Optimization For Environmental Sound Classification | Applied Acoustics, 202 (2023)
14284 | 0.895 | Reddy B.S.; Chowdary D.M.; Srinivas R.; Rahmani M.O. | Classification Of Environmental And Urban Sounds Using Deep Learning Techniques | 4th IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics, ICDCECE 2025 (2025)
26145 | 0.888 | Fang Z.; Yin B.; Du Z.; Huang X. | Fast Environmental Sound Classification Based On Resource Adaptive Convolutional Neural Network | Scientific Reports, 12, 1 (2022)
48795 | 0.887 | Liu Z.; Yeh W.-C. | Simplified Swarm Optimisation For CNN Hyperparameters: A Sound Classification Approach | International Journal of Web and Grid Services, 20, 1 (2024)
14305 | 0.885 | Agarwal M.; Gill K.S.; Chattopadhyay S.; Singh M. | Classification Of Urban Sound Using Sequential Convolutional Neural Network (CNN) Model And Its Visualisation | 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems, ICITEICS 2024 (2024)
60186 | 0.875 | Agarwal M.; Gill K.S.; Aggarwal P.; Rawat R.S.; Sunil G. | Urban Sound Classification Using VGG19 Convolutional Neural Network (CNN) Model And Its Visualisation | 4th International Conference on Innovative Practices in Technology and Management 2024, ICIPTM 2024 (2024)
58279 | 0.874 | Vijay M.; Ruthwik Saran K.; Reddy K.R.; Aditya Ram K.; Babu J.Y. | Towards Robust Environmental Sound Classification: A Deep Learning Approach Leveraging Time-Frequency Representations | 2nd International Conference on Emerging Research in Computational Science, ICERCS 2024 (2024)
24672 | 0.871 | Lamrini M.; Chkouri M.Y.; Touhafi A. | Evaluating The Performance Of Pre-Trained Convolutional Neural Network For Audio Classification On Embedded Systems For Anomaly Detection In Smart Cities | Sensors, 23, 13 (2023)