Decoding Covert Speech from EEG Using a Functional Areas Spatio-Temporal Transformer
Format | Journal Article
Language | English
Published | 02.04.2025
Summary: Covert speech involves imagining speaking without producing audible sound or any articulatory movement. Decoding covert speech from the electroencephalogram (EEG) is challenging because the mapping between neural activity and pronunciation is poorly understood and EEG signals have a low signal-to-noise ratio. In this study, we developed a large-scale multi-utterance speech EEG dataset from 57 right-handed native English-speaking subjects, each performing covert and overt speech tasks by repeating the same word in five utterances within a ten-second window. Given the spatio-temporal nature of neural activation during speech pronunciation, we developed the Functional Areas Spatio-temporal Transformer (FAST), an effective framework that converts EEG signals into tokens and uses a transformer architecture for sequence encoding. Visualizations of FAST-generated activation maps across frontal and temporal brain regions for each covertly spoken word reveal distinct and interpretable speech neural features, providing new insights into the discriminative features of the neural representation of covert speech. This is the first report of such a study, providing interpretable evidence for speech decoding from EEG. The code for this work is publicly available at https://github.com/Jiang-Muyun/FAST
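The abstract describes FAST as converting EEG signals into tokens and encoding the token sequence with a transformer. As a rough illustration of that general pipeline (not the paper's actual implementation — the channel-to-area grouping, window length, and model dimensions below are all made-up assumptions), here is a minimal NumPy sketch that patches a multichannel EEG recording into spatio-temporal tokens and passes them through one self-attention layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's configuration):
n_channels, n_samples = 64, 1000      # e.g. 64-channel EEG, 10 s at 100 Hz
n_areas, win = 8, 100                 # 8 hypothetical "functional areas", 1 s windows
d_model = 32                          # embedding dimension

eeg = rng.standard_normal((n_channels, n_samples))  # stand-in for a recording

# 1. Group channels into functional areas (here: 8 areas of 8 channels each).
areas = eeg.reshape(n_areas, n_channels // n_areas, n_samples)

# 2. Segment each area's signal into non-overlapping time windows and
#    flatten every (area, window) patch into one token vector.
n_windows = n_samples // win
patches = areas.reshape(n_areas, n_channels // n_areas, n_windows, win)
tokens = patches.transpose(0, 2, 1, 3).reshape(n_areas * n_windows, -1)

# 3. Linear projection of each token to the model dimension.
W_embed = rng.standard_normal((tokens.shape[1], d_model)) / np.sqrt(tokens.shape[1])
x = tokens @ W_embed                  # (n_tokens, d_model)

# 4. One single-head self-attention layer (the core of a transformer encoder).
Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
              for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d_model)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)          # row-wise softmax
encoded = attn @ v                    # contextualized token sequence

print(tokens.shape, encoded.shape)    # (80, 800) (80, 32)
```

In this sketch each token mixes one area's channels over one time window, so the attention layer can relate activity across both brain regions and time — the spatio-temporal structure the abstract attributes to FAST; the actual tokenization and network details are in the linked repository.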
DOI: 10.48550/arxiv.2504.03762