Two-stream Dynamic Heterogeneous Graph Recurrent Neural Network for Multi-label Multi-modal Emotion Recognition

Bibliographic Details
Published in	IEEE Transactions on Affective Computing, pp. 1-14
Main Authors	Wang, Jing; Feng, Zhiyang; Ning, Xiaojun; Lin, Youfang; Chen, Badong; Jia, Ziyu
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Brain modeling; Correlation; Data mining; Deep learning; Dynamic graph; Electrocardiography; Electroencephalography; Emotion recognition; Feature extraction; Graph recurrent neural network; Heterogeneous graph; Multi-modal emotion recognition; Physiology; Robustness

Summary:	The study of the relationship between emotions and the physiological signals of subjects under multimedia stimulation is an emerging field in which many important advances have been made. However, several challenges remain: 1) how to effectively exploit the complementarity of spatial-spectral-temporal domain information; 2) how to exploit both the heterogeneity of and the correlation among multi-modal physiological signals; 3) how to make the model more robust to missing channels; and 4) how to model the dependency among different emotions. In this paper, we propose a novel two-stream Dynamic Heterogeneous Graph Recurrent Neural Network, called DHGRNN. DHGRNN consists of a spatial-temporal stream, a spatial-spectral stream, a fusion layer, and a multi-label classifier. Each stream is composed of a graph transformer network, an evolved graph convolutional neural network, and gated recurrent units. The graph-based two-stream structure fuses spatial-spectral-temporal domain information simultaneously. The graph transformer network models the heterogeneity of the multi-modal physiological signals, and the evolved graph convolutional neural network models their correlation. To improve robustness to missing channel data, we recast the problem as a dynamic graph problem and address it with a dynamic graph neural network. In addition, we propose a multi-label classifier to model the dependency among different emotion dimensions. Experiments on three public datasets demonstrate that the proposed model outperforms existing state-of-the-art methods.
ISSN:	1949-3045
DOI:	10.1109/TAFFC.2025.3561439
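
Illustrative sketch (not from the paper): the summary describes two graph-recurrent streams whose outputs are fused and passed to a multi-label classifier, with missing channels treated as a dynamic graph. The NumPy code below is a rough sketch of that data flow only, under several assumptions. A single plain graph convolution stands in for both the graph transformer network and the evolved graph convolutional network. The classifier uses independent sigmoids, so it does not model the label dependency the paper proposes. All names, dimensions, and random weights are hypothetical and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def masked_norm_adj(adj, present):
    """Drop edges of missing channels, add self-loops, normalize symmetrically."""
    m = np.asarray(present, dtype=float)
    a = (adj + np.eye(len(m))) * np.outer(m, m)
    deg = a.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return a * inv_sqrt[:, None] * inv_sqrt[None, :]

class GRUCell:
    """Minimal gated recurrent unit (no bias terms)."""
    def __init__(self, in_dim, hid_dim):
        shape = (in_dim + hid_dim, hid_dim)
        self.Wz = rng.normal(0, 0.1, shape)
        self.Wr = rng.normal(0, 0.1, shape)
        self.Wh = rng.normal(0, 0.1, shape)

    def __call__(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ self.Wz)            # update gate
        r = sigmoid(xh @ self.Wr)            # reset gate
        cand = np.tanh(np.concatenate([x, r * h]) @ self.Wh)
        return (1 - z) * h + z * cand

class Stream:
    """One stream: per-step graph convolution, node pooling, GRU over steps."""
    def __init__(self, feat_dim, gcn_dim, hid_dim):
        self.W = rng.normal(0, 0.1, (feat_dim, gcn_dim))
        self.gru = GRUCell(gcn_dim, hid_dim)
        self.hid_dim = hid_dim

    def __call__(self, x_seq, adj, present_seq):
        h = np.zeros(self.hid_dim)
        for x, present in zip(x_seq, present_seq):
            # The graph changes per step: missing channels lose all edges.
            a_hat = masked_norm_adj(adj, present)
            node_emb = np.maximum(a_hat @ x @ self.W, 0.0)  # ReLU
            m = np.asarray(present, dtype=float)
            pooled = (node_emb * m[:, None]).sum(axis=0) / max(m.sum(), 1.0)
            h = self.gru(pooled, h)
        return h

def predict(x_a, x_b, adj, present_seq, stream_a, stream_b, W_out, b_out):
    """Fuse both streams by concatenation; one sigmoid per emotion label."""
    fused = np.concatenate([stream_a(x_a, adj, present_seq),
                            stream_b(x_b, adj, present_seq)])
    return sigmoid(fused @ W_out + b_out)

# Hypothetical toy setup: 5 channels, 4 steps, 3 emotion labels.
n_nodes, n_steps, n_labels = 5, 4, 3
adj = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)
x_time = rng.normal(size=(n_steps, n_nodes, 8))   # temporal features
x_spec = rng.normal(size=(n_steps, n_nodes, 6))   # spectral features
present_seq = np.ones((n_steps, n_nodes))
present_seq[2:, 1] = 0                            # channel 1 drops out at step 2

temporal_stream = Stream(8, 16, 12)
spectral_stream = Stream(6, 16, 12)
W_out = rng.normal(0, 0.1, (24, n_labels))
b_out = np.zeros(n_labels)

probs = predict(x_time, x_spec, adj, present_seq,
                temporal_stream, spectral_stream, W_out, b_out)
print(probs.shape)
```

Because a missing channel's row and column in the adjacency matrix are zeroed, its recorded values (as long as they are finite) do not influence the prediction in this sketch. The model sees a smaller graph at those steps rather than imputed data.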