CFDA-CSF: A Multi-Modal Domain Adaptation Method for Cross-Subject Emotion Recognition

Bibliographic Details
Published in: IEEE Transactions on Affective Computing, Vol. 15, No. 3, pp. 1502-1513
Main Authors: Jimenez-Guarneros, Magdiel; Fuentes-Pineda, Gibran
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2024

Summary: Multi-modal classifiers for emotion recognition have become prominent, as the emotional states of subjects can be more comprehensively inferred from Electroencephalogram (EEG) signals and eye movements. However, existing classifiers experience a decrease in performance due to the distribution shift when applied to new users. Unsupervised domain adaptation (UDA) emerges as a solution to address the distribution shift between subjects by learning a shared latent feature space. Nevertheless, most UDA approaches focus on a single modality, while existing multi-modal approaches do not consider that fine-grained structures should also be explicitly aligned and the learned feature space must be discriminative. In this paper, we propose Coarse and Fine-grained Distribution Alignment with Correlated and Separable Features (CFDA-CSF), which performs a coarse alignment over the global feature space, and a fine-grained alignment between modalities from each domain distribution. At the same time, the model learns intra-domain correlated features, while a separable feature space is encouraged on new subjects. We conduct an extensive experimental study across the available sessions on three public datasets for multi-modal emotion recognition: SEED, SEED-IV, and SEED-V. Our proposal effectively improves the recognition performance in every session, achieving an average accuracy of 93.05%, 85.87%, and 91.20% for SEED; 85.72%, 89.60%, and 86.88% for SEED-IV; and 88.49%, 91.37%, and 91.57% for SEED-V.
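The abstract gives only a high-level view of the objective. Purely for orientation, the sketch below shows one way such a composite loss could be assembled in PyTorch: a linear-kernel MMD stands in for both the coarse (global) and fine-grained (per-modality) alignment terms, an MSE term approximates intra-domain modality correlation, and a prediction-entropy penalty encourages a separable feature space on new subjects. All loss choices, weights, function names, and the shared feature dimension are assumptions of this sketch, not the paper's formulation.

```python
# Minimal illustrative sketch (not the authors' implementation) of a
# composite objective combining coarse- and fine-grained distribution
# alignment with correlated and separable features. The linear-kernel MMD,
# MSE-based correlation term, entropy penalty, loss weights, and shared
# feature dimension D are assumptions made for this sketch only.
import torch
import torch.nn.functional as F


def mmd(x, y):
    # Linear-kernel MMD: squared distance between batch feature means.
    return (x.mean(dim=0) - y.mean(dim=0)).pow(2).sum()


def prediction_entropy(logits):
    # Mean entropy of the class posterior; minimizing it pushes the
    # classifier toward confident (separable) decisions on target data.
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p + 1e-8)).sum(dim=1).mean()


def adaptation_loss(src_eeg, src_eye, tgt_eeg, tgt_eye,
                    src_logits, src_labels, tgt_logits,
                    weights=(1.0, 1.0, 1.0, 0.1)):
    w_coarse, w_fine, w_corr, w_ent = weights

    # Global (fused) features per domain, used for the coarse alignment.
    src_global = torch.cat([src_eeg, src_eye], dim=1)
    tgt_global = torch.cat([tgt_eeg, tgt_eye], dim=1)

    # Supervised loss on labeled source subjects.
    cls = F.cross_entropy(src_logits, src_labels)
    # Coarse alignment over the global feature space.
    coarse = mmd(src_global, tgt_global)
    # Fine-grained alignment of each modality across domains.
    fine = mmd(src_eeg, tgt_eeg) + mmd(src_eye, tgt_eye)
    # Intra-domain correlation: pull EEG and eye-movement features of the
    # same domain together (assumes both encoders output dimension D).
    corr = F.mse_loss(src_eeg, src_eye) + F.mse_loss(tgt_eeg, tgt_eye)
    # Separability on unlabeled target subjects via low-entropy predictions.
    ent = prediction_entropy(tgt_logits)

    return cls + w_coarse * coarse + w_fine * fine + w_corr * corr + w_ent * ent


if __name__ == "__main__":
    B, D, C = 32, 64, 3  # batch size, shared feature dim, emotion classes
    loss = adaptation_loss(
        torch.randn(B, D), torch.randn(B, D),  # source EEG / eye features
        torch.randn(B, D), torch.randn(B, D),  # target EEG / eye features
        torch.randn(B, C),                     # source logits
        torch.randint(0, C, (B,)),             # source labels
        torch.randn(B, C))                     # target logits
    print(loss.item())
```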
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2024.3357656