POST: Prototype‐oriented similarity transfer framework for cross‐domain facial expression recognition

Bibliographic Details
Published in: Computer Animation and Virtual Worlds, Vol. 35, No. 3
Main Authors: Guo, Zhe; Wei, Bingxin; Cai, Qinglin; Liu, Jiayi; Wang, Yi
Format: Journal Article
Language: English
Published: Chichester: Wiley Subscription Services, Inc., 01.05.2024

Summary: Facial expression recognition (FER) is one of the most popular research topics in computer vision. Most deep learning expression recognition methods perform well on a single dataset but may struggle in cross-domain FER applications when applied to different datasets. Cross-dataset FER also suffers from difficulties such as feature distribution deviation and discriminator degradation. To address these issues, we propose a prototype-oriented similarity transfer framework (POST) for cross-domain FER. The bidirectional cross-attention Swin Transformer (BCS Transformer) module is designed to aggregate local facial feature similarities across different domains, enabling the extraction of relevant cross-domain features. The dual learnable category prototypes are designed to represent latent-space samples for both the source and target domains, ensuring enhanced domain alignment by leveraging both cross-domain and domain-specific features. We further introduce the self-training resampling (STR) strategy to enhance similarity transfer. Experimental results with the RAF-DB dataset as the source domain and the CK+, FER2013, JAFFE, and SFEW 2.0 datasets as the target domains show that our approach achieves much higher performance than state-of-the-art cross-domain FER methods.
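
The summary only describes the BCS Transformer at a high level. As a rough illustration of the bidirectional cross-attention idea it names, the minimal PyTorch sketch below lets source-domain and target-domain patch features attend to each other in both directions. This is an assumption-laden toy example: the class name, feature dimensions, and use of nn.MultiheadAttention are illustrative choices and are not taken from the paper.

# Minimal sketch of bidirectional cross-attention between source- and
# target-domain feature sequences (illustrative only; not the paper's
# actual BCS Transformer implementation). Shapes and names are hypothetical.
import torch
import torch.nn as nn


class BidirectionalCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Source queries attend to target keys/values, and vice versa.
        self.src_to_tgt = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.tgt_to_src = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_src = nn.LayerNorm(dim)
        self.norm_tgt = nn.LayerNorm(dim)

    def forward(self, src_tokens: torch.Tensor, tgt_tokens: torch.Tensor):
        # src_tokens, tgt_tokens: (batch, num_patches, dim) local facial features.
        src_attended, _ = self.src_to_tgt(src_tokens, tgt_tokens, tgt_tokens)
        tgt_attended, _ = self.tgt_to_src(tgt_tokens, src_tokens, src_tokens)
        # Residual connections keep domain-specific features alongside the
        # cross-domain similarities aggregated by attention.
        return (self.norm_src(src_tokens + src_attended),
                self.norm_tgt(tgt_tokens + tgt_attended))


if __name__ == "__main__":
    src = torch.randn(2, 49, 256)   # source-domain patch features
    tgt = torch.randn(2, 49, 256)   # target-domain patch features
    out_src, out_tgt = BidirectionalCrossAttention()(src, tgt)
    print(out_src.shape, out_tgt.shape)  # both (2, 49, 256)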
ISSN: 1546-4261, 1546-427X
DOI: 10.1002/cav.2260