Smart City Gnosys

Smart city article details

Title Empower Smart Cities With Sampling-Wise Dynamic Facial Expression Recognition Via Frame-Sequence Contrastive Learning
ID_Doc 22893
Authors Yan S.; Wang Y.; Mai X.; Zhao Q.; Song W.; Huang J.; Tao Z.; Wang H.; Gao S.; Zhang W.
Year 2024
Published Computer Communications, 216
DOI http://dx.doi.org/10.1016/j.comcom.2023.12.032
Abstract In the construction of smart cities, facial expression analysis plays a crucial role. It can be used in traffic monitoring systems to alleviate traffic pressure by analyzing the emotional states of drivers and passengers. In the field of smart healthcare, it can provide more precise treatment and services to patients. In the realm of social entertainment, it can offer more intelligent and personalized interactions. In summary, emotion computing technology will play an increasingly significant role in the development of smart cities. In the task of dynamic facial expression recognition (DFER), analyzing the spatial–temporal features of video sequences has become a common research approach. However, facial expression sequences often contain a significant number of neutral frames and noisy frames, which can increase computational costs and reduce performance. Effectively extracting key frames for spatial–temporal feature analysis is therefore a critical aspect of dynamic facial expression recognition. To address this issue, we propose a sampling-wise dynamic facial expression recognition method via frame-sequence contrastive learning, called SW-FSCL. SW-FSCL aims to improve DFER performance by using intelligent dual-stream sampling strategies and frame-sequence contrastive learning to extract key frames and reduce the impact of neutral and noisy frames. We propose a key frame proposal (KFP) block that analyzes the spatial–temporal features of sequences and calculates weight ratios for key frame extraction. Because long sequences are prone to information loss, we introduce a temporal aggregation (TA) block to prevent data loss and preserve the integrity of temporal information. The experimental results provide compelling evidence that the proposed approach not only outperforms current state-of-the-art algorithms on two widely used benchmark datasets (DFEW, FERV39k), but also yields visualization results that offer insights into the interpretability of the SW-FSCL method. © 2023 Elsevier B.V.
Author Keywords Contrastive learning; Dynamic facial expression recognition; Key frame extraction; Sampling-wise
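
To make the sampling and contrastive ideas described in the abstract more concrete, below is a minimal PyTorch sketch, not the authors' released SW-FSCL code. It illustrates one plausible reading of the method: frames are scored against the overall sequence representation to obtain key-frame weight ratios (standing in for the KFP block), and a weight-pooled frame embedding is contrasted against a sequence-level embedding with an InfoNCE-style loss. The function names key_frame_weights and frame_sequence_contrastive_loss, the feature dimensions, and the temperature value are illustrative assumptions.

import torch
import torch.nn.functional as F

# Illustrative sketch only; not the authors' SW-FSCL implementation.

def key_frame_weights(frame_feats: torch.Tensor) -> torch.Tensor:
    """Score each frame against the mean sequence feature and normalize.

    frame_feats: (B, T, D) per-frame embeddings.
    Returns (B, T) weights summing to 1 over the time axis; frames most
    aligned with the overall sequence receive the largest weights
    (a stand-in for the KFP block's weight ratios).
    """
    seq_feat = frame_feats.mean(dim=1, keepdim=True)              # (B, 1, D)
    scores = F.cosine_similarity(frame_feats, seq_feat, dim=-1)   # (B, T)
    return torch.softmax(scores, dim=1)

def frame_sequence_contrastive_loss(frame_feats: torch.Tensor,
                                    seq_feats: torch.Tensor,
                                    temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE-style loss pulling each sequence embedding toward its own
    weight-pooled frame embedding and away from those of other clips.

    frame_feats: (B, T, D) frame embeddings; seq_feats: (B, D) sequence embeddings.
    """
    weights = key_frame_weights(frame_feats)                       # (B, T)
    pooled = (weights.unsqueeze(-1) * frame_feats).sum(dim=1)      # (B, D)
    pooled = F.normalize(pooled, dim=-1)
    seq = F.normalize(seq_feats, dim=-1)
    logits = seq @ pooled.t() / temperature                        # (B, B)
    targets = torch.arange(seq.size(0), device=seq.device)         # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Usage with random features standing in for backbone outputs.
frames = torch.randn(8, 16, 512)   # 8 clips, 16 frames each, 512-dim frame features
seqs = torch.randn(8, 512)         # sequence-level embeddings from a temporal branch
loss = frame_sequence_contrastive_loss(frames, seqs)
print(loss.item())

In this reading, the weight pooling suppresses neutral and noisy frames before the contrastive comparison, which is the effect the abstract attributes to the dual-stream sampling and KFP components; the actual architecture and loss formulation are described in the paper itself.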


Similar Articles


Id | Similarity | Authors | Title | Published
27190 | 0.883 | Nguyen M.; Yan W.Q. | From Faces To Traffic Lights: A Multi-Scale Approach For Emotional State Representation | Proceedings - 2023 IEEE International Conference on High Performance Computing and Communications, Data Science and Systems, Smart City and Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2023 (2023)
39991 | 0.859 | Sudha S.S.; Suganya S.S. | On-Road Driver Facial Expression Emotion Recognition With Parallel Multi-Verse Optimizer (PMVO) And Optical Flow Reconstruction For Partial Occlusion In Internet Of Things (IoT) | Measurement: Sensors, 26 (2023)