Multimodal Semi-Supervised Domain Adaptation Using Cross-Modal Learning and Joint Distribution Alignment for Cross-Subject Emotion Recognition
Published in: IEEE Transactions on Instrumentation and Measurement, Vol. 74, pp. 1-12
Main Authors: , ,
Format: Journal Article
Language: English
Published: New York: IEEE, 2025
Summary: Multimodal physiological data from electroencephalogram (EEG) and eye movement (EM) signals have been shown to be useful in effectively recognizing human emotional states. Unfortunately, individual differences reduce the applicability of existing multimodal classifiers to new users, and low performance is usually observed. To address this problem, existing works mainly focus on multimodal domain adaptation between a labeled source domain and an unlabeled target domain, transferring knowledge from known subjects to new ones. However, a limited set of labeled target data has not been effectively exploited to enhance the knowledge transfer between subjects. In this article, we propose a multimodal semi-supervised domain adaptation (SSDA) method, called cross-modal learning and joint distribution alignment (CMJDA), to address the limitations of existing works, following three strategies: 1) discriminative features are extracted per modality through independent neural networks; 2) correlated features and consistent predictions are produced between modalities; and 3) the marginal and conditional distributions of the labeled source data, the limited labeled target data, and the abundant unlabeled target data are encouraged to be similar. We conducted comparison experiments on two public benchmarks for emotion recognition, SEED-IV and SEED-V, using leave-one-out cross-validation (LOOCV). Our proposal achieves an average accuracy of 92.50%-96.13% across the three available sessions of SEED-IV and SEED-V, using only three labeled target samples per class from the first recorded trial.
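The three strategies outlined in the summary lend themselves to a compact training objective. Below is a minimal, hypothetical PyTorch sketch of one such multimodal SSDA training step, assuming simple fully connected encoders per modality, a symmetric-KL cross-modal consistency term, and a linear-kernel MMD for marginal feature alignment; these choices, the layer sizes, and the loss weights are illustrative assumptions and are not taken from the paper, which also aligns conditional (class-wise) distributions.

```python
# Illustrative sketch (not the authors' implementation) of a multimodal
# semi-supervised domain adaptation step: independent per-modality networks,
# a cross-modal consistency term, and MMD-style source/target alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityNet(nn.Module):
    """Independent encoder and classifier for one modality (EEG or eye movement)."""
    def __init__(self, in_dim, feat_dim=64, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)
        return z, self.classifier(z)

def mmd_linear(x, y):
    """Linear-kernel MMD between two feature batches (marginal alignment surrogate)."""
    return (x.mean(dim=0) - y.mean(dim=0)).pow(2).sum()

def ssda_loss(eeg_net, em_net,
              src_eeg, src_em, src_y,        # labeled source batch
              tgt_l_eeg, tgt_l_em, tgt_l_y,  # few labeled target samples
              tgt_u_eeg, tgt_u_em,           # unlabeled target batch
              lam_consist=0.1, lam_align=0.1):
    # 1) Discriminative per-modality losses on all labeled data (source + labeled target).
    y = torch.cat([src_y, tgt_l_y])
    z_eeg_l, p_eeg_l = eeg_net(torch.cat([src_eeg, tgt_l_eeg]))
    z_em_l,  p_em_l  = em_net(torch.cat([src_em,  tgt_l_em]))
    cls = F.cross_entropy(p_eeg_l, y) + F.cross_entropy(p_em_l, y)

    # 2) Cross-modal consistency on unlabeled target data:
    #    symmetric KL divergence between the two modalities' predictions.
    z_eeg_u, p_eeg_u = eeg_net(tgt_u_eeg)
    z_em_u,  p_em_u  = em_net(tgt_u_em)
    consist = (F.kl_div(F.log_softmax(p_eeg_u, dim=1), F.softmax(p_em_u, dim=1),
                        reduction="batchmean")
               + F.kl_div(F.log_softmax(p_em_u, dim=1), F.softmax(p_eeg_u, dim=1),
                          reduction="batchmean"))

    # 3) Marginal feature alignment between (source + labeled target) and unlabeled target.
    align = mmd_linear(z_eeg_l, z_eeg_u) + mmd_linear(z_em_l, z_em_u)

    return cls + lam_consist * consist + lam_align * align
```

In a toy run, `eeg_net` and `em_net` would each be a `ModalityNet` with `in_dim` set to that modality's feature dimension, and the summed loss would be minimized jointly over both networks with a standard optimizer such as Adam.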
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2025.3551924