ShARc: Shape and Appearance Recognition for Person Identification In-the-wild

Bibliographic Details
Published in	2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 6278-6288
Main Authors	Zhu, Haidong; Zheng, Wanrong; Zheng, Zhaoheng; Nevatia, Ram
Format	Conference Proceeding
Language	English
Published	IEEE 03.01.2024
Subjects	Algorithms; Biometrics; Biometrics (access control); body pose; Computer vision; Data mining; Degradation; face; Feature extraction; gesture; Shape; Skeleton; Video recognition and understanding
Summary:	Identifying individuals in unconstrained video settings is a valuable yet challenging task in biometric analysis due to variations in appearances, environments, degradations, and occlusions. In this paper, we present ShARc, a multimodal approach for video-based person identification in uncontrolled environments that emphasizes 3-D body shape, pose, and appearance. We introduce two encoders: a Pose and Shape Encoder (PSE) and an Aggregated Appearance Encoder (AAE). PSE encodes the body shape via binarized silhouettes, skeleton motions, and 3-D body shape, while AAE provides two levels of temporal appearance feature aggregation: attention-based feature aggregation and averaging aggregation. For attention-based feature aggregation, we employ spatial and temporal attention to focus on key areas for person distinction. For averaging aggregation, we introduce a novel flattening layer after averaging to extract more distinguishable information and reduce overfitting of attention. We utilize centroid feature averaging for gallery registration. We demonstrate significant improvements over existing state-of-the-art methods on public datasets, including CCVID, MEVID, and BRIAR.
ISSN:	2642-9381
DOI:	10.1109/WACV57701.2024.00617
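
The summary mentions centroid feature averaging for gallery registration. The sketch below illustrates the general idea only and is not the authors' implementation: it assumes each enrolled identity has several fixed-length feature vectors, averages them into one centroid per identity, and matches a probe by cosine similarity. The function names (`register_gallery`, `identify`), the L2 normalization, and the choice of cosine similarity are assumptions made for illustration.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def register_gallery(features):
    """Build one centroid per identity.

    features: iterable of (identity, feature_vector) pairs, where every
    feature_vector has the same length. Hypothetical input format.
    """
    grouped = defaultdict(list)
    for identity, vec in features:
        grouped[identity].append(l2_normalize(vec))

    gallery = {}
    for identity, vecs in grouped.items():
        dim = len(vecs[0])
        # Element-wise mean of all vectors enrolled for this identity.
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        gallery[identity] = l2_normalize(centroid)
    return gallery


def identify(probe, gallery):
    """Return the identity whose centroid has the highest cosine similarity."""
    probe = l2_normalize(probe)
    scores = {
        identity: sum(a * b for a, b in zip(probe, centroid))
        for identity, centroid in gallery.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    enrolled = [
        ("a", [1.0, 0.0]),
        ("a", [0.8, 0.2]),
        ("b", [0.0, 1.0]),
        ("b", [0.1, 0.9]),
    ]
    gallery = register_gallery(enrolled)
    print(identify([0.9, 0.1], gallery))
```

Storing a single centroid per identity keeps the gallery size independent of the number of enrolled clips, at the cost of discarding within-identity variation.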