Smart City Gnosys

Smart city article details

Title Integration Vision-Language Models For Feature Extraction In Multi-Camera Multi-Object Tracking
ID_Doc 32233
Authors Trung N.H.; Son T.T.; Van Su T.; Hung P.D.
Year 2025
Published Lecture Notes in Computer Science, 15585 LNAI
DOI http://dx.doi.org/10.1007/978-981-96-4606-7_19
Abstract Multi-camera multi-object tracking is a critical task in applications such as surveillance, autonomous driving, and smart cities, where accurate and robust tracking of multiple objects across different camera views is essential. This work presents a pipeline for multi-camera multi-object tracking that combines deep learning models, specifically YOLOX and OSNet, with traditional algorithms such as the Hungarian algorithm and the Kalman filter. A significant enhancement in this pipeline is the incorporation of the CLIP-Reid model for pedestrian feature extraction, leveraging the power of vision-language models. The proposed approach is evaluated by comparing the effectiveness of CLIP-Reid against traditional image-based feature extraction methods on two distinct camera sequences, Laboratory and Terrace, from the EPFL Multi-camera Pedestrian Videos dataset. The results show a modest improvement in tracking performance: IDF1 increases by 0.4% and MOTA by 0.9% on the Laboratory sequence, and IDF1 increases by 7.2% and MOTA by 0.2% on the Terrace sequence, indicating the potential for incremental accuracy gains in complex tracking scenarios. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
Author Keywords Multi-camera multi-object tracking; Multi-object tracking; Vision-language models
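
The abstract reports its gains in IDF1 and MOTA. As a reading aid, here is a minimal sketch of how these two scores are commonly defined from error counts. The function names and the example counts are hypothetical, chosen for illustration only; they are not taken from the paper, and the authors' evaluation code may differ.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple Object Tracking Accuracy.

    MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the total number of
    ground-truth boxes summed over all frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score.

    IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), computed from
    identity-level true positives, false positives, and false negatives.
    """
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        return 0.0  # no detections and no ground truth to match
    return 2 * idtp / denom


# Hypothetical counts, NOT results from the paper.
example_mota = mota(false_negatives=10, false_positives=5,
                    id_switches=5, num_gt=100)
example_idf1 = idf1(idtp=80, idfp=10, idfn=30)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

MOTA penalizes missed detections, false detections, and identity switches together, while IDF1 focuses on how consistently each identity is preserved over time, which is why the two scores can move by different amounts on the same sequence.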


Similar Articles


Id | Similarity | Authors | Title | Published
17814 | 0.854 | Wang X.; Sun Z.; Chehri A.; Jeon G.; Song Y. | Deep Learning And Multi-Modal Fusion For Real-Time Multi-Object Tracking: Algorithms, Challenges, Datasets, And Comparative Study | Information Fusion, 105 (2024)
38289 | 0.851 | Zhang W.; Xin Y.; Zheng C.; Peng X.; Tai J. | Multi-Object Tracking Algorithm Based On Multi-Layer Feature Adaptive Fusion | Proceedings - 2022 2nd International Conference on Frontiers of Electronics, Information and Computation Technologies, ICFEICT 2022 (2022)