Multi-Modal Cross-Subject Emotion Feature Alignment and Recognition with EEG and Eye Movements

Bibliographic Details
Published in: IEEE Transactions on Affective Computing, Vol. 16, No. 3, pp. 1-15
Main Authors: Zhu, Qi; Zhu, Ting; Fei, Lunke; Zheng, Chuhang; Shao, Wei; Zhang, David; Zhang, Daoqiang
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2025.3554399

Summary: Multi-modal emotion recognition has attracted much attention in human-computer interaction because it provides complementary information to the recognition model. However, distribution drift among subjects and the heterogeneity of different modalities pose challenges to multi-modal emotion recognition, limiting its practical application. Most current multi-modal emotion recognition methods struggle to suppress these uncertainties during fusion. In this paper, we propose a cross-subject multi-modal emotion recognition framework that jointly learns subject-independent representations and common features between EEG and eye movements. First, we design a dynamic adversarial domain adaptation scheme for cross-subject distribution alignment, dynamically selecting source domains during training. Second, we simultaneously capture intra-modal and inter-modal emotion-related features with self-attention and cross-attention mechanisms, obtaining a robust and complementary representation of emotional information. Then, two contrastive loss functions are imposed on the above network to further reduce inter-modal heterogeneity and to mine higher-order semantic similarity between synchronously collected multi-modal data. Finally, we use the output of the softmax layer as the predicted value. Experimental results on several multi-modal emotion datasets with EEG and eye movements demonstrate that our method is significantly superior to state-of-the-art emotion recognition approaches. Our code is available at: https://github.com/xbrainnet/CSMM.
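
Note: The abstract describes the attention-based fusion and contrastive alignment only at a high level; the authors' actual implementation is in the linked repository. The sketch below is a minimal, self-contained PyTorch illustration of what such a pipeline can look like: per-modality self-attention, cross-attention between EEG and eye-movement features, a softmax classifier, and an InfoNCE-style contrastive loss on the paired embeddings. All module names, feature dimensions, class counts, and the specific contrastive loss are assumptions made for illustration, not the paper's architecture, and the dynamic adversarial domain adaptation stage is omitted.

# Illustrative sketch only: attention-based EEG / eye-movement fusion with a
# contrastive alignment loss, loosely following the steps named in the abstract.
# Dimensions, module layout, and the InfoNCE loss are assumptions, NOT the
# authors' implementation (see https://github.com/xbrainnet/CSMM).
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalFusion(nn.Module):
    """Self-attention within each modality, cross-attention between them."""

    def __init__(self, eeg_dim=310, eye_dim=33, d_model=128, n_heads=4, n_classes=3):
        super().__init__()
        self.eeg_proj = nn.Linear(eeg_dim, d_model)
        self.eye_proj = nn.Linear(eye_dim, d_model)
        self.eeg_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.eye_self = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_eeg = nn.MultiheadAttention(d_model, n_heads, batch_first=True)  # EEG queries eye features
        self.cross_eye = nn.MultiheadAttention(d_model, n_heads, batch_first=True)  # eye queries EEG features
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, eeg, eye):
        # eeg: (batch, seq, eeg_dim), eye: (batch, seq, eye_dim)
        e = self.eeg_proj(eeg)
        o = self.eye_proj(eye)
        # Intra-modal features via self-attention.
        e_intra, _ = self.eeg_self(e, e, e)
        o_intra, _ = self.eye_self(o, o, o)
        # Inter-modal features via cross-attention: each modality attends to the other.
        e_cross, _ = self.cross_eeg(e_intra, o_intra, o_intra)
        o_cross, _ = self.cross_eye(o_intra, e_intra, e_intra)
        # Pool over the sequence and fuse by concatenation.
        z_eeg = e_cross.mean(dim=1)
        z_eye = o_cross.mean(dim=1)
        logits = self.classifier(torch.cat([z_eeg, z_eye], dim=-1))
        return logits, z_eeg, z_eye


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE-style contrastive loss: synchronously collected EEG and eye-movement
    samples at the same batch position are treated as positive pairs."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    model = CrossModalFusion()
    eeg = torch.randn(8, 10, 310)          # toy batch: 8 samples, 10 time steps
    eye = torch.randn(8, 10, 33)
    labels = torch.randint(0, 3, (8,))
    logits, z_eeg, z_eye = model(eeg, eye)
    loss = F.cross_entropy(logits, labels) + info_nce(z_eeg, z_eye)
    loss.backward()
    print(float(loss))

In this toy setup the classification loss trains the softmax output while the contrastive term pulls paired EEG and eye-movement embeddings together, which is one common way to reduce inter-modal heterogeneity; the paper additionally uses a second contrastive loss and adversarial domain adaptation, which are not reproduced here.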