| Abstract |
Crowd density estimation has recently attracted attention from various industries. For efficient computation in real-time applications, this paper proposes a lightweight relation-aware network based on self-contrastive distillation for red-green-blue and thermal (RGB-T) crowd counting. To better handle images of irregular areas captured by drones, a novel multi-relationship graph reasoning module is designed in the network. By modeling and reasoning over graph-node relationships across multiple layers, the module achieves deep interaction across resolutions at the pixel level and obtains rich multimodal information. Furthermore, to make full use of the self-attention information, a Gaussian selection module is introduced, in which Gaussian filtering extracts the valuable information and converts it into a probabilistic form to further optimize communication between modalities. This paper also proposes a new cyclic contrastive distillation training method. Samples are divided into positives and negatives for contrastive learning, so that the optimized intermediate features influence the final output; the final output is then transferred back to the underlying features for self-distillation. In this way, the intermediate information is strengthened and a new training cycle is formed. The feature representation of each layer is improved without increasing the number of parameters. Finally, extensive experiments show that our model performs well on the RGB-T crowd dataset. © 2014 IEEE. |
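
The idea behind the Gaussian selection module (filter the attention information, then express it in probabilistic form) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the function names (`gaussian_kernel`, `smooth_rows`, `gaussian_select`), the kernel radius and sigma, the clamped border handling, and the choice of a softmax as the probabilistic form are all hypothetical.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(grid, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(grid[0])
    out = []
    for row in grid:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                src = min(max(x + k - radius, 0), width - 1)
                acc += w * row[src]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(grid):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*grid)]


def gaussian_select(attention, radius=1, sigma=1.0):
    """Smooth a 2-D attention map, then softmax it into a probability map.

    Assumption: 'probabilistic form' is modeled here as a softmax over
    all spatial positions; the paper may use a different normalization.
    """
    kernel = gaussian_kernel(radius, sigma)
    # Separable Gaussian filter: rows first, then columns via transpose.
    row_pass = smooth_rows(attention, kernel)
    smoothed = transpose(smooth_rows(transpose(row_pass), kernel))
    # Numerically stable softmax over every position in the map.
    peak = max(max(row) for row in smoothed)
    exps = [[math.exp(v - peak) for v in row] for row in smoothed]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]
```

Under these assumptions, calling `gaussian_select` on a small attention map returns a map of the same shape whose entries are positive and sum to one, so it could be used to weight features exchanged between the RGB and thermal branches.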