RGB-D Face Recognition With Identity-Style Disentanglement and Depth Augmentation

Bibliographic Details
Published in: IEEE Transactions on Biometrics, Behavior, and Identity Science, Vol. 5, No. 3, pp. 334-347
Main Authors: Chiu, Meng-Tzu; Cheng, Hsun-Ying; Wang, Chien-Yi; Lai, Shang-Hong
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2023

Summary: Deep learning approaches achieve highly accurate face recognition by training models on very large face image datasets. Unlike 2D face image datasets, there is a lack of large 3D face datasets available to the public. Existing public 3D face datasets were usually collected from only a few subjects, leading to over-fitting. This paper proposes two CNN models to improve the RGB-D face recognition task. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The other is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In our multi-modality face recognition model, a feature disentanglement scheme is employed to factorize the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset into a large RGB-D face dataset, which is used for training our RGB-D face recognition model. Our experimental results show that DepthNet can produce more reliable depth maps from face images with the segmentation mask. Our multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.
ISSN: 2637-6407
DOI: 10.1109/TBIOM.2022.3233769
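
The summary above describes a two-branch RGB-D recognition model whose features are factorized into identity-related and style-related components. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' released code: the small encoders, the feature dimensions, and the simple split-and-concatenate disentanglement are illustrative assumptions, and the auxiliary segmentation mask branch and DepthNet are omitted.

# Hypothetical sketch of a two-branch RGB-D recognizer with identity/style
# feature disentanglement (assumes PyTorch; all architectural details are
# placeholders, not the paper's actual backbones or losses).
import torch
import torch.nn as nn


def conv_encoder(in_ch: int, feat_dim: int) -> nn.Sequential:
    # Tiny CNN encoder standing in for the RGB / depth recognition branches.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, feat_dim),
    )


class RGBDDisentangledRecognizer(nn.Module):
    # Each modality's feature vector is split into an identity-related half and a
    # style-related half; only the identity halves are fused for classification.
    def __init__(self, feat_dim: int = 256, num_ids: int = 1000):
        super().__init__()
        self.rgb_branch = conv_encoder(3, feat_dim)
        self.depth_branch = conv_encoder(1, feat_dim)
        self.id_dim = feat_dim // 2
        self.classifier = nn.Linear(2 * self.id_dim, num_ids)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        f_rgb = self.rgb_branch(rgb)
        f_depth = self.depth_branch(depth)
        # Disentangle: first half = identity-related, second half = style-related.
        id_rgb, style_rgb = f_rgb[:, :self.id_dim], f_rgb[:, self.id_dim:]
        id_depth, style_depth = f_depth[:, :self.id_dim], f_depth[:, self.id_dim:]
        logits = self.classifier(torch.cat([id_rgb, id_depth], dim=1))
        return logits, (id_rgb, style_rgb, id_depth, style_depth)


if __name__ == "__main__":
    model = RGBDDisentangledRecognizer()
    rgb = torch.randn(2, 3, 112, 112)    # RGB face crops
    depth = torch.randn(2, 1, 112, 112)  # depth maps (e.g., estimated by a depth network)
    logits, feats = model(rgb, depth)
    print(logits.shape)  # torch.Size([2, 1000])

In the paper's setting, the depth input would come from depth maps estimated by DepthNet on a large 2D face dataset, so the recognizer can be trained on RGB-D pairs at 2D-dataset scale; the sketch simply shows where such depth maps would enter the model.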