Smart City Gnosys

Smart city article details

Title What Happens In Crowd Scenes: A New Dataset About Crowd Scenes For Image Captioning
ID_Doc 61671
Authors Wang L.; Li H.; Hu W.; Zhang X.; Qiu H.; Meng F.; Wu Q.
Year 2023
Published IEEE Transactions on Multimedia, 25
DOI http://dx.doi.org/10.1109/TMM.2022.3192729
Abstract Making machines endowed with eyes and brains to effectively understand and analyze crowd scenes is of paramount importance for building a smart city to serve people. This is of far-reaching significance for the guidance of dense crowds and accident prevention, such as crowding and stampedes. As a typical multimodal scene understanding task, image captioning has always attracted widespread attention. However, crowd scene understanding captioning is rarely studied due to the unobtainability of related datasets. Therefore, it is difficult to know what happens in crowd scenes. In order to fill this research gap, we propose a crowd scenes caption dataset named CrowdCaption which has the advantages of crowd-topic scenes, comprehensive and complex caption descriptions, typical relationships and detailed grounding annotations. The complexity and diversity of the descriptions and the specificity of the crowd scenes make this dataset extremely challenging to most current methods. Thus, we propose a Multi-hierarchical Attribute Guided Crowd Caption Network (MAGC) based on crowd objects, actions, and status (such as position, dress, posture, etc.) aiming to generate crowd-specific detailed descriptions. We conduct extensive experiments on our CrowdCaption dataset, and our proposed method reaches the state-of-the-art (SoTA) performance. We hope the CrowdCaption dataset can assist future studies related to crowd scenes in the multimodal domain. © 2022 IEEE.
Author Keywords crowd scenes; CrowdCaption; image captioning; multimodal understanding


Similar Articles


Id 16685
Similarity 0.861
Authors Hu Y.; Liu Y.; Cao G.; Wang J.
Title Crowdcl: Unsupervised Crowd Counting Network Via Contrastive Learning
Published IEEE Internet of Things Journal, 12, 12 (2025)