Rank-Based Verification for Long-Term Face Tracking in Crowded Scenes

Bibliographic Details
Published in	IEEE Transactions on Biometrics, Behavior, and Identity Science, Vol. 3, no. 4, pp. 495-505
Main Authors	Barquero, German; Hupont, Isabelle; Fernandez Tena, Carles
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2021
Subjects	Complex systems; Face recognition; face tracking; face verification; Faces; Long-term tracking; Machine learning; Modules; rank-based verification; Real-time systems; Target tracking; Tracking; Verification; Video surveillance
Summary:	Most current multi-object trackers focus on short-term tracking and rely on deep, complex systems that often cannot operate in real time, making them impractical for video surveillance. In this paper we present a long-term, multi-face tracking architecture designed for crowded contexts, where the face is often the only visible part of a person. Our system builds on advances in face detection and face recognition to achieve long-term tracking, and it is largely unconstrained by the motion and occlusions of people. It follows a tracking-by-detection approach, combining a fast short-term visual tracker with a novel online tracklet reconnection strategy grounded in rank-based face verification. The proposed rank-based constraint favours higher inter-class distance among tracklets and reduces the propagation of errors caused by wrong reconnections. Additionally, a correction module revises past assignments at no extra computational cost. We present a series of experiments that introduce novel specialized metrics for evaluating long-term tracking capabilities, and we publicly release a dataset of 10 manually annotated videos with a total length of 8 min 54 s. Our findings validate the robustness of each proposed module and demonstrate that, in these challenging contexts, our approach yields up to 50% longer tracks than state-of-the-art deep learning trackers.
ISSN:	2637-6407
DOI:	10.1109/TBIOM.2021.3099568
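
Illustrative note: the Summary mentions tracklet reconnection grounded in rank-based face verification but does not spell out the rule. The sketch below shows one plausible reading, offered as an assumption and not as the authors' method: a new tracklet is reconnected to a stored tracklet only if that tracklet is its rank-1 nearest neighbour, the distance is below a threshold, and no other stored tracklet is at least as close to the candidate. The names `l2_normalise`, `tracklet_distance`, `rank_based_reconnect` and the parameter `max_dist` are hypothetical, and the embeddings are toy 2-D vectors standing in for face descriptors.

```python
import numpy as np


def l2_normalise(x):
    """Scale each row of x to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def tracklet_distance(a, b):
    """Mean cosine distance between two sets of unit-length embeddings."""
    return float(1.0 - (a @ b.T).mean())


def rank_based_reconnect(query, gallery, max_dist=0.4):
    """Return the id of the gallery tracklet to reconnect `query` to, or None.

    Assumed rule (hypothetical, not taken from the paper):
      1. the candidate is the rank-1 nearest gallery tracklet to the query;
      2. its distance to the query does not exceed `max_dist`;
      3. no other gallery tracklet is at least as close to the candidate
         as the query is, so the match is unambiguous by rank.
    """
    if not gallery:
        return None
    dists = {tid: tracklet_distance(query, emb) for tid, emb in gallery.items()}
    best = min(dists, key=dists.get)
    if dists[best] > max_dist:
        return None
    for tid, emb in gallery.items():
        if tid != best and tracklet_distance(gallery[best], emb) <= dists[best]:
            return None  # another identity ranks ahead of the query: refuse
    return best


# Toy usage: two stored tracklets and one new tracklet.
gallery = {
    "A": l2_normalise([[1.0, 0.0], [1.0, 0.1]]),
    "B": l2_normalise([[0.0, 1.0]]),
}
query = l2_normalise([[1.0, 0.05]])
match = rank_based_reconnect(query, gallery)
print(match)
```

Refusing ambiguous matches trades some recall for fewer wrong reconnections, which is in the spirit of the error-propagation concern raised in the Summary; the threshold value here is arbitrary.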