CFDA-CSF: A Multi-Modal Domain Adaptation Method for Cross-Subject Emotion Recognition

Bibliographic Details
Published in: IEEE Transactions on Affective Computing, Vol. 15, No. 3, pp. 1502-1513
Main Authors: Jimenez-Guarneros, Magdiel; Fuentes-Pineda, Gibran
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2024

Summary: Multi-modal classifiers for emotion recognition have become prominent, as the emotional states of subjects can be more comprehensively inferred from Electroencephalogram (EEG) signals and eye movements. However, existing classifiers experience a decrease in performance due to the distribution shift when applied to new users. Unsupervised domain adaptation (UDA) emerges as a solution to address the distribution shift between subjects by learning a shared latent feature space. Nevertheless, most UDA approaches focus on a single modality, while existing multi-modal approaches do not consider that fine-grained structures should also be explicitly aligned and the learned feature space must be discriminative. In this paper, we propose Coarse and Fine-grained Distribution Alignment with Correlated and Separable Features (CFDA-CSF), which performs a coarse alignment over the global feature space, and a fine-grained alignment between modalities from each domain distribution. At the same time, the model learns intra-domain correlated features, while a separable feature space is encouraged on new subjects. We conduct an extensive experimental study across the available sessions on three public datasets for multi-modal emotion recognition: SEED, SEED-IV, and SEED-V. Our proposal effectively improves the recognition performance in every session, achieving an average accuracy of 93.05%, 85.87%, and 91.20% for SEED; 85.72%, 89.60%, and 86.88% for SEED-IV; and 88.49%, 91.37%, and 91.57% for SEED-V.
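The abstract gives only a high-level view of the objective. Purely for orientation, the sketch below shows one way such a composite loss could be assembled in PyTorch: a linear-kernel MMD stands in for both the coarse (global) and fine-grained (per-modality) alignment terms, an MSE term approximates intra-domain modality correlation, and a prediction-entropy penalty encourages a separable feature space on new subjects. All loss choices, weights, function names, and the shared feature dimension are assumptions of this sketch, not the paper's formulation.

```python
# Minimal illustrative sketch (not the authors' implementation) of a
# composite objective combining coarse- and fine-grained distribution
# alignment with correlated and separable features. The linear-kernel MMD,
# MSE-based correlation term, entropy penalty, loss weights, and shared
# feature dimension D are assumptions made for this sketch only.
import torch
import torch.nn.functional as F


def mmd(x, y):
    # Linear-kernel MMD: squared distance between batch feature means.
    return (x.mean(dim=0) - y.mean(dim=0)).pow(2).sum()


def prediction_entropy(logits):
    # Mean entropy of the class posterior; minimizing it pushes the
    # classifier toward confident (separable) decisions on target data.
    p = F.softmax(logits, dim=1)
    return -(p * torch.log(p + 1e-8)).sum(dim=1).mean()


def adaptation_loss(src_eeg, src_eye, tgt_eeg, tgt_eye,
                    src_logits, src_labels, tgt_logits,
                    weights=(1.0, 1.0, 1.0, 0.1)):
    w_coarse, w_fine, w_corr, w_ent = weights

    # Global (fused) features per domain, used for the coarse alignment.
    src_global = torch.cat([src_eeg, src_eye], dim=1)
    tgt_global = torch.cat([tgt_eeg, tgt_eye], dim=1)

    # Supervised loss on labeled source subjects.
    cls = F.cross_entropy(src_logits, src_labels)
    # Coarse alignment over the global feature space.
    coarse = mmd(src_global, tgt_global)
    # Fine-grained alignment of each modality across domains.
    fine = mmd(src_eeg, tgt_eeg) + mmd(src_eye, tgt_eye)
    # Intra-domain correlation: pull EEG and eye-movement features of the
    # same domain together (assumes both encoders output dimension D).
    corr = F.mse_loss(src_eeg, src_eye) + F.mse_loss(tgt_eeg, tgt_eye)
    # Separability on unlabeled target subjects via low-entropy predictions.
    ent = prediction_entropy(tgt_logits)

    return cls + w_coarse * coarse + w_fine * fine + w_corr * corr + w_ent * ent


if __name__ == "__main__":
    B, D, C = 32, 64, 3  # batch size, shared feature dim, emotion classes
    loss = adaptation_loss(
        torch.randn(B, D), torch.randn(B, D),  # source EEG / eye features
        torch.randn(B, D), torch.randn(B, D),  # target EEG / eye features
        torch.randn(B, C),                     # source logits
        torch.randint(0, C, (B,)),             # source labels
        torch.randn(B, C))                     # target logits
    print(loss.item())
```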
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2024.3357656