Joint low-rank tensor fusion and cross-modal attention for multimodal physiological signals based emotion recognition

Bibliographic Details
Published in: Physiological Measurement, Vol. 45, No. 7, pp. 75003–75016
Main Authors: Wan, Xin; Wang, Yongxiong; Wang, Zhe; Tang, Yiheng; Liu, Benke
Format: Journal Article
Language: English
Published: England, IOP Publishing, 01.07.2024

Summary: Objective. Physiological-signal-based emotion recognition is a prominent research domain in human-computer interaction. Previous studies predominantly focused on unimodal data, giving limited attention to the interplay among multiple modalities. Within multimodal emotion recognition, integrating information from diverse modalities and leveraging their complementary information are the two essential issues for obtaining robust representations. Approach. Thus, we propose an intermediate fusion strategy that combines low-rank tensor fusion with cross-modal attention to enhance the fusion of electroencephalogram, electrooculogram, electromyography, and galvanic skin response signals. First, handcrafted features from the distinct modalities are individually fed to corresponding feature extractors to obtain latent features. Subsequently, low-rank tensor fusion integrates the information into a modality interaction representation. Finally, a cross-modal attention module explores the potential relationships between the distinct latent features and the modality interaction representation, recalibrating the weights of the different modalities; the resultant representation is adopted for emotion recognition. Main results. To validate the effectiveness of the proposed method, we conduct subject-independent experiments on the DEAP dataset. The proposed method achieves accuracies of 73.82% and 74.55% for valence and arousal classification, respectively. Significance. The results of extensive experiments verify the outstanding performance of the proposed method.
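The two fusion stages described in the summary, low-rank tensor fusion followed by cross-modal attention over the modality interaction representation, can be sketched in NumPy. This is an illustrative toy only: the random weight factors stand in for learned parameters, and the dimensions, rank, and function names are assumptions for demonstration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_rank_fusion(features, rank, out_dim, rng):
    """Fuse per-modality latent vectors via rank-R factors, approximating
    an outer-product tensor fusion without building the full tensor."""
    # append a constant 1 so unimodal and lower-order terms survive the product
    zs = [np.concatenate([z, [1.0]]) for z in features]
    # random factors stand in for learned modality-specific weights
    factors = [rng.standard_normal((rank, out_dim, z.shape[0])) for z in zs]
    proj = [np.einsum('rdi,i->rd', W, z) for W, z in zip(factors, zs)]
    # elementwise product across modalities, then sum over the rank dimension
    return np.prod(np.stack(proj), axis=0).sum(axis=0)

def cross_modal_attention(query, modal_feats, dim, rng):
    """Scaled dot-product attention: the fused interaction representation
    (query) attends over per-modality latents to recalibrate their weights."""
    q = rng.standard_normal((dim, query.shape[0])) @ query           # (dim,)
    kv = np.stack([rng.standard_normal((dim, z.shape[0])) @ z
                   for z in modal_feats])                            # (M, dim)
    scores = kv @ q / np.sqrt(dim)
    w = np.exp(scores - scores.max())
    w /= w.sum()                       # attention weights, one per modality
    return w @ kv, w                   # weighted combination, (dim,)

# toy latent features for the four modalities: EEG, EOG, EMG, GSR
dims = [32, 16, 16, 8]
feats = [rng.standard_normal(d) for d in dims]

h = low_rank_fusion(feats, rank=4, out_dim=24, rng=rng)    # interaction rep.
rep, weights = cross_modal_attention(h, feats, dim=24, rng=rng)
print(h.shape, rep.shape)   # (24,) (24,)
```

The rank-1-sum structure is what keeps the fusion tractable: instead of materializing a tensor whose size is the product of all modality dimensions, each modality contributes a small (rank, out_dim) projection, and the elementwise product recovers the multiplicative interactions.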
Bibliography:PMEA-105650.R1
ISSN: 0967-3334
1361-6579
DOI:10.1088/1361-6579/ad5bbc