Person Tracking by Detection Using Dual Visible-Infrared Cameras

Bibliographic Details
Published in: IEEE Internet of Things Journal, Vol. 9, no. 22, pp. 23241-23251
Main Authors: Geng, Xuewen; Li, Minglei; Liu, Wenping; Zhu, Shengkai; Jiang, Hongbo; Bian, Jiawen; Fan, Xuezhi; Peng, Ruiqing; Luo, Jun
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 15.11.2022
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: We study cross-modality person reidentification (ReID) and tracking with dual visible-infrared (VI) cameras. Most existing tracking-by-detection work focuses on single-modality visible ReID, which is inapplicable in poor-light environments. The major difficulties of cross-modality (e.g., VI) ReID stem from the large modality gap between three-channel visible images and one-channel infrared images, and from unknown environmental factors such as background clutter and occlusions. To tackle these issues, we propose to enrich the diversity of visible and infrared images for intra- and cross-modality matching by using both channel-aware data augmentation (DA) techniques (e.g., channel-exchanged augmentation and random occlusion) and standard DA techniques. On top of these DA techniques, we incorporate ResNet50 and a vision transformer (ViT) into the feature extraction backbone network and apply the dynamic weight average (DWA) strategy to learn the loss weights, treating the minimization of identity loss and triplet loss as a multitask learning problem. We then apply the proposed ReID approach to person tracking in the field of interest. Experiments on two public data sets, RegDB and SYSU-MM01, show that our approach improves on state-of-the-art rank-1, mAP, and mINP for cross-modality matching. In addition, experiments on our own data set show that tracking by VI-ReID using dual VI cameras can achieve an accuracy of around 0.24 m.
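The DWA strategy named in the summary can be illustrated with a minimal sketch. It follows the commonly used formulation of dynamic weight average, in which each task's weight is a softmax over the ratio of its two most recent losses, scaled so that the weights sum to the number of tasks. The function names, the temperature value of 2.0, and the sample loss values are assumptions made for illustration; the record does not state the authors' exact configuration.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Return one loss weight per task; the weights sum to the task count.

    prev_losses and prev_prev_losses hold each task's average loss from the
    last two epochs (hypothetical inputs; all values must be positive).
    """
    if len(prev_losses) != len(prev_prev_losses):
        raise ValueError("both loss lists must have one entry per task")
    num_tasks = len(prev_losses)
    # Relative descending rate: a ratio near 1 means the loss barely moved.
    rates = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    # Softmax over the rates; a higher temperature gives more even weights.
    exps = [math.exp(r / temperature) for r in rates]
    total = sum(exps)
    return [num_tasks * e / total for e in exps]


def combined_loss(identity_loss, triplet_loss, weights):
    """Weighted sum of the two ReID losses (task order: identity, triplet)."""
    return weights[0] * identity_loss + weights[1] * triplet_loss


# Illustrative values only: identity loss stalled, triplet loss halved.
weights = dwa_weights(prev_losses=[1.0, 0.5], prev_prev_losses=[1.0, 1.0])
total = combined_loss(identity_loss=1.0, triplet_loss=0.5, weights=weights)
```

In this sketch the task whose loss is decreasing more slowly (here the identity loss) receives the larger weight. During the first two epochs, when no loss history exists yet, the ratios are usually set to 1 so that all tasks are weighted equally.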
ISSN: 2327-4662
DOI: 10.1109/JIOT.2022.3188270