POST: Prototype‐oriented similarity transfer framework for cross‐domain facial expression recognition
Published in: Computer Animation and Virtual Worlds, Vol. 35, No. 3
Main Authors: , , , ,
Format: Journal Article
Language: English
Published: Chichester: Wiley Subscription Services, Inc., 01.05.2024
Summary: Facial expression recognition (FER) is a popular research topic in computer vision. Most deep learning expression recognition methods perform well on a single dataset but may struggle in cross‐domain FER applications when applied to different datasets. Cross‐dataset FER also suffers from difficulties such as feature distribution deviation and discriminator degradation. To address these issues, we propose a prototype‐oriented similarity transfer framework (POST) for cross‐domain FER. The bidirectional cross‐attention Swin Transformer (BCS Transformer) module is designed to aggregate local facial feature similarities across different domains, enabling the extraction of relevant cross‐domain features. The dual learnable category prototypes are designed to represent latent‐space samples for both source and target domains, ensuring enhanced domain alignment by leveraging both cross‐domain and domain‐specific features. We further introduce the self‐training resampling (STR) strategy to enhance similarity transfer. Experimental results, with the RAF‐DB dataset as the source domain and the CK+, FER2013, JAFFE, and SFEW 2.0 datasets as the target domains, show that our approach achieves much higher performance than state‐of‐the‐art cross‐domain FER methods.
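The summary describes bidirectional cross‐attention between source‐ and target‐domain features as the core of the BCS Transformer module. A minimal NumPy sketch of that bidirectional step is given below; the single‐head, unprojected scaled dot‐product form, the feature dimensions, and the variable names are all illustrative assumptions, since the record does not specify the module's internals (the actual module sits inside a Swin Transformer with learned query/key/value projections):

```python
import numpy as np

def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query row attends over the
    other domain's features (a simplified, single-head sketch)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    # Numerically stable softmax over each query's attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys_values

rng = np.random.default_rng(0)
src = rng.normal(size=(4, 8))   # hypothetical source-domain patch features
tgt = rng.normal(size=(6, 8))   # hypothetical target-domain patch features

# "Bidirectional": source attends to target, and target attends to source,
# so each domain's features are enriched with cross-domain similarities.
src_enhanced = cross_attention(src, tgt)
tgt_enhanced = cross_attention(tgt, src)
print(src_enhanced.shape, tgt_enhanced.shape)  # (4, 8) (6, 8)
```

In this reading, aggregating "local facial feature similarities across different domains" corresponds to the attention weights mixing each patch feature with its most similar patches from the other domain.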
In this paper, we proposed a prototype‐oriented similarity transfer framework (POST) for cross‐domain facial expression recognition. The bidirectional cross‐attention Swin Transformer (BCS Transformer) module is designed to aggregate local facial feature similarities across different domains. The dual learnable category prototypes are designed to represent latent‐space samples for both source and target domains. The self‐training resampling (STR) strategy is further introduced to enhance similarity transfer.
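The conclusion pairs learnable category prototypes with a self‐training resampling step. One plausible reading, sketched below, is that target samples are pseudo‐labeled by their nearest class prototype and only confident samples are kept for the next round of training; the cosine similarity measure, the confidence threshold, and all names here are hypothetical, as the record does not give the actual STR procedure:

```python
import numpy as np

def pseudo_label(feats, prototypes):
    """Assign each target feature the class of its most similar
    prototype (cosine similarity is an assumption)."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = f @ p.T                       # (n_samples, n_classes)
    return sims.argmax(axis=1), sims.max(axis=1)

def self_training_resample(feats, prototypes, threshold=0.6):
    """Keep only target samples whose pseudo-label confidence passes
    a (hypothetical) threshold, in the spirit of an STR-style filter."""
    labels, conf = pseudo_label(feats, prototypes)
    keep = conf >= threshold
    return feats[keep], labels[keep]

rng = np.random.default_rng(1)
prototypes = rng.normal(size=(7, 16))     # 7 basic expression classes
target_feats = rng.normal(size=(20, 16))  # unlabeled target-domain features
kept, kept_labels = self_training_resample(target_feats, prototypes)
print(kept.shape[0], "target samples retained")
```

Filtering by prototype similarity before retraining is a common way to limit pseudo‐label noise in self‐training; whether POST uses exactly this criterion is not stated in the record.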
Bibliography: ObjectType-Article-1; SourceType-Scholarly Journals-1; ObjectType-Feature-2; content type line 14
ISSN: 1546-4261, 1546-427X
DOI: 10.1002/cav.2260