Multimodal Semi-Supervised Domain Adaptation Using Cross-Modal Learning and Joint Distribution Alignment for Cross-Subject Emotion Recognition
Published in: IEEE Transactions on Instrumentation and Measurement, Vol. 74, pp. 1-12
Main Authors: , ,
Format: Journal Article
Language: English
Published: New York: IEEE, 2025
Summary: Multimodal physiological data from electroencephalogram (EEG) and eye movement (EM) signals have been shown to be useful in effectively recognizing human emotional states. Unfortunately, individual differences reduce the applicability of existing multimodal classifiers to new users, and low performance is usually observed. To address this problem, existing works mainly focus on multimodal domain adaptation between a labeled source domain and an unlabeled target domain, transferring knowledge from known subjects to new ones. However, a limited set of labeled target data has not been effectively exploited to enhance the knowledge transfer between subjects. In this article, we propose a multimodal semi-supervised domain adaptation (SSDA) method, called cross-modal learning and joint distribution alignment (CMJDA), to address the limitations of existing works, following three strategies: 1) discriminative features are extracted per modality through independent neural networks; 2) correlated features and consistent predictions are produced between modalities; and 3) the marginal and conditional distributions of the labeled source data, the limited labeled target data, and the abundant unlabeled target data are encouraged to be similar. We conducted comparison experiments on two public benchmarks for emotion recognition, SEED-IV and SEED-V, using leave-one-out cross-validation (LOOCV). Our proposal achieves an average accuracy of 92.50%-96.13% across the three available sessions of SEED-IV and SEED-V, using only three labeled target samples per class from the first recorded trial.
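The three strategies outlined in the summary lend themselves to a compact training objective. Below is a minimal, hypothetical PyTorch sketch of one such multimodal SSDA training step, assuming simple fully connected encoders per modality, a symmetric-KL cross-modal consistency term, and a linear-kernel MMD for marginal feature alignment; these choices, the layer sizes, and the loss weights are illustrative assumptions and are not taken from the paper, which also aligns conditional (class-wise) distributions.

```python
# Illustrative sketch (not the authors' implementation) of a multimodal
# semi-supervised domain adaptation step: independent per-modality networks,
# a cross-modal consistency term, and MMD-style source/target alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityNet(nn.Module):
    """Independent encoder and classifier for one modality (EEG or eye movement)."""
    def __init__(self, in_dim, feat_dim=64, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)
        return z, self.classifier(z)

def mmd_linear(x, y):
    """Linear-kernel MMD between two feature batches (marginal alignment surrogate)."""
    return (x.mean(dim=0) - y.mean(dim=0)).pow(2).sum()

def ssda_loss(eeg_net, em_net,
              src_eeg, src_em, src_y,        # labeled source batch
              tgt_l_eeg, tgt_l_em, tgt_l_y,  # few labeled target samples
              tgt_u_eeg, tgt_u_em,           # unlabeled target batch
              lam_consist=0.1, lam_align=0.1):
    # 1) Discriminative per-modality losses on all labeled data (source + labeled target).
    y = torch.cat([src_y, tgt_l_y])
    z_eeg_l, p_eeg_l = eeg_net(torch.cat([src_eeg, tgt_l_eeg]))
    z_em_l,  p_em_l  = em_net(torch.cat([src_em,  tgt_l_em]))
    cls = F.cross_entropy(p_eeg_l, y) + F.cross_entropy(p_em_l, y)

    # 2) Cross-modal consistency on unlabeled target data:
    #    symmetric KL divergence between the two modalities' predictions.
    z_eeg_u, p_eeg_u = eeg_net(tgt_u_eeg)
    z_em_u,  p_em_u  = em_net(tgt_u_em)
    consist = (F.kl_div(F.log_softmax(p_eeg_u, dim=1), F.softmax(p_em_u, dim=1),
                        reduction="batchmean")
               + F.kl_div(F.log_softmax(p_em_u, dim=1), F.softmax(p_eeg_u, dim=1),
                          reduction="batchmean"))

    # 3) Marginal feature alignment between (source + labeled target) and unlabeled target.
    align = mmd_linear(z_eeg_l, z_eeg_u) + mmd_linear(z_em_l, z_em_u)

    return cls + lam_consist * consist + lam_align * align
```

In a toy run, `eeg_net` and `em_net` would each be a `ModalityNet` with `in_dim` set to that modality's feature dimension, and the summed loss would be minimized jointly over both networks with a standard optimizer such as Adam.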
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2025.3551924