Person Tracking by Detection Using Dual Visible-Infrared Cameras

Bibliographic Details
Published in	IEEE Internet of Things Journal, Vol. 9, no. 22, pp. 23241-23251
Main Authors	Geng, Xuewen; Li, Minglei; Liu, Wenping; Zhu, Shengkai; Jiang, Hongbo; Bian, Jiawen; Fan, Xuezhi; Peng, Ruiqing; Luo, Jun
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 15.11.2022
Subjects	Automobiles; Cameras; Clutter; Computer networks; Cross-modality; Data augmentation; Datasets; dual visible-infrared (VI) camera; Feature extraction; Infrared cameras; Infrared imagery; Infrared tracking; Learning; Matching; person reidentification (ReID); person tracking; Training; Trajectory; Transformers
Summary:	We study cross-modality person reidentification (ReID) and tracking with dual visible-infrared (VI) cameras. Most existing work on tracking-by-detection has focused on single-modality visible ReID, which is inapplicable in poor-light environments. The major difficulties for cross-modality (e.g., VI) ReID stem from the large modality gap between three-channel visible images and one-channel infrared images, and from unknown environmental factors such as background clutter and occlusions. To tackle these issues, we propose to enrich the diversity of visible and infrared images for intra- and cross-modality matching by using both channel-aware data augmentation (DA) techniques (e.g., channel exchanged augmentation and random occlusions) and standard DA techniques. On top of these DA techniques, we incorporate ResNet50 and a vision transformer (ViT) into the feature extraction backbone network and apply the dynamic weight average (DWA) strategy to learn loss weights, treating the minimization of identity loss and triplet loss as a multitask learning problem. We then apply the proposed ReID approach to person tracking in the field of interest. Experiments on two public data sets, RegDB and SYSU-MM01, show that our approach can improve on the state-of-the-art rank-1, mAP, and mINP for cross-modality matching. In addition, experiments on our own data set show that tracking by VI-ReID using dual VI cameras can achieve an accuracy of around 0.24 m.
ISSN:	2327-4662
DOI:	10.1109/JIOT.2022.3188270
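
The summary names channel exchanged augmentation and random occlusions but does not spell them out. The following is a minimal sketch of what such channel-aware augmentations might look like, assuming images are NumPy arrays of shape (height, width, 3). The function names, the `max_frac` parameter, and the choice to fill the occluded patch with zeros are assumptions made for illustration and are not taken from the article.

```python
import numpy as np


def channel_exchange(image, rng):
    """Copy one randomly chosen colour channel into all three channels.

    The result is a three-channel image with identical channels, which
    loosely resembles a one-channel infrared image (illustrative only).
    """
    c = int(rng.integers(0, 3))            # pick R, G, or B at random
    single = image[:, :, c:c + 1]          # keep the channel axis: (H, W, 1)
    return np.repeat(single, 3, axis=2)    # back to (H, W, 3)


def random_occlusion(image, rng, max_frac=0.3):
    """Zero out one random rectangle to imitate a partial occlusion.

    max_frac (hypothetical parameter) bounds each side of the rectangle
    as a fraction of the corresponding image side.
    """
    h, w = image.shape[:2]
    occ_h = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    occ_w = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out = image.copy()                     # leave the input untouched
    out[top:top + occ_h, left:left + occ_w, :] = 0
    return out
```

In a training pipeline these would typically be applied with some probability alongside standard augmentations such as flipping and cropping; the probabilities and patch fill used by the authors are not given in this record.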
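
The summary also mentions dynamic weight average (DWA) for balancing the identity loss and the triplet loss. As a rough illustration, the sketch below follows the commonly cited DWA formulation, in which each task weight is a softmax over the ratio of that task's losses in the two preceding epochs, scaled so the weights sum to the number of tasks. Whether the article uses exactly this form is not stated here; the function name and the temperature value of 2.0 are assumptions.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Compute per-task loss weights with dynamic weight average (DWA).

    prev_losses:      average loss of each task at epoch t-1
    prev_prev_losses: average loss of each task at epoch t-2
    temperature:      softens the weight distribution (assumed value)

    Returns a list of weights that sums to the number of tasks.
    """
    # Relative descent rate of each task: a ratio near 1 means slow progress.
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    num_tasks = len(ratios)
    return [num_tasks * e / total for e in exps]


# Hypothetical losses for two tasks: [identity loss, triplet loss].
weights = dwa_weights([0.9, 0.5], [1.0, 1.0])
# The combined loss would then be weights[0] * id_loss + weights[1] * tri_loss.
```

A task whose loss is decreasing more slowly receives the larger weight, so in the hypothetical values above the identity loss is weighted more heavily than the triplet loss. For the first two epochs, when no loss history exists, the weights are usually set to 1.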