ERTNet: an interpretable transformer-based framework for EEG emotion recognition

Bibliographic Details
Published in: Frontiers in Neuroscience, Vol. 18, p. 1320645
Main Authors: Liu, Ruixiang; Chao, Yihu; Ma, Xuerui; Sha, Xianzheng; Sun, Limin; Li, Shuo; Chang, Shijie
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 2024

Summary: Emotion recognition using EEG signals enables clinicians to assess patients' emotional states with precision and immediacy. However, the complexity of EEG signal data poses challenges for traditional recognition methods. Deep learning techniques effectively capture the nuanced emotional cues within these signals by leveraging extensive data. Nonetheless, most deep learning techniques lack interpretability while maintaining accuracy. We developed an interpretable end-to-end EEG emotion recognition framework rooted in a hybrid CNN and transformer architecture. Specifically, temporal convolution isolates salient information from EEG signals while filtering out potential high-frequency noise. Spatial convolution discerns the topological connections between channels. Subsequently, the transformer module processes the feature maps to integrate high-level spatiotemporal features, enabling identification of the prevailing emotional state. Experimental results demonstrate that our model excels in diverse emotion classification, achieving an accuracy of 74.23% ± 2.59% on the dimensional model (DEAP) and 67.17% ± 1.70% on the discrete model (SEED-V). These results surpass the performance of both CNN- and LSTM-based counterparts. Through interpretive analysis, we ascertained that the beta and gamma bands of the EEG signals exert the most significant impact on emotion recognition performance. Notably, our model can independently tailor a Gaussian-like convolution kernel, effectively filtering high-frequency noise from the input EEG data. Given its robust performance and interpretative capabilities, our proposed framework is a promising tool for EEG-driven emotion brain-computer interfaces.
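The pipeline described in the abstract (temporal convolution, spatial convolution across channels, then transformer-style attention and a classification head) can be sketched as follows. This is a minimal NumPy illustration of the data flow only, not the authors' implementation; all dimensions, the single-head attention without projections, and the Gaussian kernel width are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 32 EEG channels and
# 128 time samples for one recording segment.
C, T = 32, 128
x = rng.standard_normal((C, T))          # one EEG segment (channels x time)

# 1) Temporal convolution: a 1-D kernel slid along time for every channel.
#    The paper reports the learned kernel takes a Gaussian-like low-pass
#    shape; here we hard-code such a shape for illustration.
k = 15
kernel = np.exp(-0.5 * ((np.arange(k) - k // 2) / 2.0) ** 2)
kernel /= kernel.sum()                   # normalized Gaussian-like kernel
temporal = np.stack([np.convolve(ch, kernel, mode="same") for ch in x])

# 2) Spatial convolution: mix information across channels with a (C x C)
#    weight, standing in for learned topological relations between electrodes.
W_spatial = rng.standard_normal((C, C)) / np.sqrt(C)
spatial = W_spatial @ temporal           # still (C, T)

# 3) Transformer-style self-attention over time steps to integrate
#    high-level spatiotemporal features (single head, no Q/K/V projections,
#    for brevity).
tokens = spatial.T                       # (T, C): one token per time step
scores = tokens @ tokens.T / np.sqrt(C)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)  # softmax over keys
features = attn @ tokens                 # (T, C) attended features

# 4) Classification head: mean-pool over time, then a linear map onto the
#    emotion classes (e.g., 5 for SEED-V).
n_classes = 5
W_out = rng.standard_normal((C, n_classes)) / np.sqrt(C)
logits = features.mean(axis=0) @ W_out
pred = int(np.argmax(logits))
```

In a trained model the convolution kernels, spatial weights, and attention projections would all be learned end-to-end; the interpretability claim rests on inspecting the learned temporal kernels (Gaussian-like low-pass) and the per-band contributions (beta, gamma) after training.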
Reviewed by: Man Fai Leung, Anglia Ruskin University, United Kingdom
Jiahui Pan, South China Normal University, China
Edited by: Zhao Lv, Anhui University, China
These authors have contributed equally to this work and share first authorship
ISSN: 1662-4548
EISSN: 1662-453X
DOI: 10.3389/fnins.2024.1320645