ST-SHAP: A hierarchical and explainable attention network for emotional EEG representation learning and decoding

Bibliographic Details
Published in Journal of neuroscience methods Vol. 414; p. 110317
Main Authors Miao, Minmin, Liang, Jin, Sheng, Zhenzhen, Liu, Wenzhe, Xu, Baoguo, Hu, Wenjun
Format Journal Article
Language English
Published Netherlands Elsevier B.V 01.02.2025
Summary:Emotion recognition using electroencephalogram (EEG) signals has become a research hotspot in the field of human–computer interaction, but sufficiently learning complex spatial–temporal representations of emotional EEG data and obtaining explainable model predictions remain major challenges. In this study, a novel hierarchical and explainable attention network, ST-SHAP, which combines the Swin Transformer (ST) and the SHapley Additive exPlanations (SHAP) technique, is proposed for automatic emotional EEG classification. Firstly, a 3D spatial–temporal feature of emotional EEG data is generated via frequency band filtering, temporal segmentation, spatial mapping, and interpolation to preserve important spatial–temporal–frequency characteristics. Secondly, a hierarchical attention network is devised to learn an abstract spatial–temporal representation of emotional EEG and perform classification. Concretely, in this decoding model, the window-based multi-head self-attention (W-MSA) module models correlations within local windows, the shifted-window multi-head self-attention (SW-MSA) module allows information to interact between different local windows, and the patch merging module further facilitates local-to-global multiscale modeling. Finally, the SHAP method is used to discover brain regions that are important for emotion processing and to improve the explainability of the Swin Transformer model. Two benchmark datasets, SEED and DREAMER, are used to evaluate classification performance. In the subject-dependent experiments, ST-SHAP achieves an average accuracy of 97.18% on the SEED dataset, and average accuracies of 96.06% and 95.98% on the arousal and valence dimensions of the DREAMER dataset, respectively. In addition, important brain regions that conform to prior knowledge of neurophysiology are discovered via a data-driven approach for both datasets. In terms of both subject-dependent and subject-independent emotional EEG decoding accuracy, our method outperforms several closely related existing methods.
These experimental results demonstrate the effectiveness and superiority of our proposed algorithm.
• A novel hierarchical attention network is designed for emotional EEG recognition.
• We utilize global and local relationships of EEG for accurate emotion recognition.
• SHAP algorithm is used to detect critical brain regions in emotion processing.
• Extensive experiments demonstrate the effectiveness of the proposed ST-SHAP.
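
The window and patch-merging operations named in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the map size, channel count, and window size are arbitrary illustrative values, the function names are hypothetical, and the attention computation itself, the SW-MSA attention mask, and the normalization and linear projection that normally follow patch merging are all omitted. It only shows how tokens are grouped in each step.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) map into non-overlapping ws x ws windows.

    Returns shape (num_windows, ws * ws, C); W-MSA would attend
    only among the ws * ws tokens inside each window.
    """
    H, W, C = x.shape
    assert H % ws == 0 and W % ws == 0, "map must tile evenly"
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/ws, W/ws, ws, ws, C)
    return x.reshape(-1, ws * ws, C)

def shifted_window_partition(x, ws):
    """Cyclically shift the map by ws // 2, then partition.

    Each shifted window straddles several regular windows, which is
    how SW-MSA lets information cross window borders.
    """
    s = ws // 2
    shifted = np.roll(x, shift=(-s, -s), axis=(0, 1))
    return window_partition(shifted, ws)

def patch_merging(x):
    """Stack each 2 x 2 neighbourhood along channels.

    (H, W, C) -> (H/2, W/2, 4C); halves the resolution so later
    stages see a coarser, more global view.
    """
    parts = [x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]]
    return np.concatenate(parts, axis=-1)

# Illustrative sizes only (assumption, not taken from the paper).
H, W, C, ws = 8, 8, 4, 4
x = np.arange(H * W * C, dtype=np.float32).reshape(H, W, C)

regular = window_partition(x, ws)          # 4 windows of 16 tokens
shifted = shifted_window_partition(x, ws)  # same shapes, offset grid
merged = patch_merging(x)                  # coarser map, 4x channels
```

In this toy setting the first shifted window covers rows and columns 2 to 5 of the original map, so it overlaps all four regular windows; alternating the two partitions is what allows local attention to propagate information across the whole map.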
ISSN:0165-0270
1872-678X
DOI:10.1016/j.jneumeth.2024.110317