Temporal aware Mixed Attention-based Convolution and Transformer Network for cross-subject EEG emotion recognition


Bibliographic Details
Published in: Computers in Biology and Medicine, Vol. 181, p. 108973
Main Authors: Si, Xiaopeng; Huang, Dong; Liang, Zhen; Sun, Yulin; Huang, He; Liu, Qile; Yang, Zhuobin; Ming, Dong
Format: Journal Article
Language: English
Published: United States: Elsevier Ltd, 01.10.2024

Summary: Emotion recognition is crucial for human–computer interaction, and electroencephalography (EEG) stands out as a valuable tool for capturing and reflecting human emotions. In this study, we propose a hierarchical hybrid model called Mixed Attention-based Convolution and Transformer Network (MACTN). This model is designed to collectively capture both local and global temporal information and is inspired by insights from neuroscientific research on the temporal dynamics of emotions. First, we introduce depth-wise temporal convolution and separable convolution to extract local temporal features. Then, a self-attention-based transformer is used to integrate the sparse global emotional features. In addition, a channel attention mechanism is designed to identify the most task-relevant channels, facilitating the capture of relationships between different channels and emotional states. Extensive experiments are conducted on three public datasets under both offline and online evaluation modes. In the multi-class cross-subject online evaluation using the THU-EP dataset, MACTN demonstrates an approximately 8% improvement in 9-class emotion recognition accuracy compared to state-of-the-art methods. In the multi-class cross-subject offline evaluation using the DEAP and SEED datasets, comparable performance is achieved solely from the raw EEG signals, without the need for prior knowledge or transfer learning during the feature extraction and learning process. Furthermore, ablation studies show that integrating the self-attention and channel-attention mechanisms improves classification performance. This method won the final championship of the Emotional BCI Competition at the World Robot Contest. The source code is available at https://github.com/ThreePoundUniverse/MACTN.
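The channel attention mechanism described above can be illustrated with a minimal NumPy sketch in the squeeze-and-excitation style: pool each EEG channel over time, pass the pooled vector through a small bottleneck, and use a sigmoid gate to reweight channels. The shapes, layer sizes, and random weights here are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Reweight EEG channels by learned importance.

    x:  (n_channels, n_samples) EEG segment.
    w1: (n_channels // r, n_channels) bottleneck weights.
    w2: (n_channels, n_channels // r) expansion weights.
    Returns the reweighted segment and the per-channel gate values.
    """
    squeeze = x.mean(axis=1)                         # (C,) average over time
    hidden = np.maximum(0.0, w1 @ squeeze)           # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid weights in (0, 1)
    return x * gate[:, None], gate

# Hypothetical sizes: 32 channels, 256 samples, reduction ratio 4.
rng = np.random.default_rng(0)
n_channels, n_samples, reduction = 32, 256, 4
x = rng.standard_normal((n_channels, n_samples))
w1 = 0.1 * rng.standard_normal((n_channels // reduction, n_channels))
w2 = 0.1 * rng.standard_normal((n_channels, n_channels // reduction))

reweighted, gate = channel_attention(x, w1, w2)
print(reweighted.shape, gate.shape)
```

Because the gate is strictly between 0 and 1, no channel is hard-pruned; task-relevant channels are merely amplified relative to the rest, which keeps the operation differentiable end to end.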
Highlights:
• Efficient local and global temporal feature extraction network for EEG signals.
• State-of-the-art multi-class emotion decoding performance.
• Model interpretability analysis was conducted in the time dimension.
Bibliography: ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
ISSN: 0010-4825
ISSN: 1879-0534
DOI: 10.1016/j.compbiomed.2024.108973