Pose Calibrated Feature Aggregation for Face Set Recognition

Bibliographic Details
Published in Proceedings - International Conference on Image Processing, pp. 161-165
Main Authors Hasani, Ibrahim; Arif, Omar
Format Conference Proceeding
Language English
Published IEEE 16.10.2022
Summary: This paper presents the Pose Calibrated Feature Aggregation Network (PCFAN), an architecture for set/video face recognition. Using stacked attention blocks and a multi-stream architecture, it automatically assigns an adaptive weight to every instance in the set, based on both the recognition embeddings and the associated face metadata, and uses these weights to produce a single, compact feature vector for the set. The model learns to favor features from images with better quality and more favorable poses, which inherently hold more information. Our block can be inserted on top of any standard recognition model for set prediction, improving performance particularly in unconstrained scenarios where subject pose and image quality vary considerably between frames. We test our approach on two challenging video face-recognition datasets, IJB-A and IJB-B, and report state-of-the-art results. Moreover, a comparison with top aggregation methods as baselines shows that PCFAN outperforms them.
ISSN: 2381-8549
DOI: 10.1109/ICIP46576.2022.9897771
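
The weighted aggregation described in the summary can be illustrated with a minimal sketch of generic softmax attention pooling. This is not the PCFAN architecture itself: the record gives no details of the stacked attention blocks or the multi-stream design, so the function name `aggregate_set`, the linear scoring vectors `w_embed` and `w_meta`, and the metadata layout are all hypothetical stand-ins chosen only to show how per-image weights derived from embeddings and metadata can yield a single set-level vector.

```python
import numpy as np

def aggregate_set(embeddings, metadata, w_embed, w_meta):
    """Pool a set of face embeddings into one vector with softmax weights.

    embeddings: (n, d) recognition embeddings, one row per image or frame
    metadata:   (n, m) per-image metadata such as pose or quality values
                (hypothetical layout, not taken from the paper)
    w_embed:    (d,) scoring vector for embeddings (stand-in for learned parameters)
    w_meta:     (m,) scoring vector for metadata (stand-in for learned parameters)
    """
    # One scalar score per image, using both the embedding and its metadata.
    scores = embeddings @ w_embed + metadata @ w_meta
    # Subtract the maximum before exponentiating for numerical stability.
    exp_scores = np.exp(scores - scores.max())
    weights = exp_scores / exp_scores.sum()
    # Weighted sum over the set gives a single d-dimensional feature vector.
    pooled = weights @ embeddings
    return pooled, weights

# Example with random data: 5 frames, 8-dimensional embeddings, 2 metadata values.
rng = np.random.default_rng(0)
set_embeddings = rng.normal(size=(5, 8))
set_metadata = rng.uniform(size=(5, 2))
set_vector, frame_weights = aggregate_set(
    set_embeddings, set_metadata, rng.normal(size=8), rng.normal(size=2)
)
```

With all-zero scoring vectors the weights are uniform and the result reduces to plain average pooling, which is the kind of baseline a learned weighting is meant to improve on.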