Text Generation of Speech Imagery Based on an Enhanced CTA-BiLSTM Model Utilizing EEG Signals
Published in | IEEE Transactions on Consumer Electronics, Vol. 71, no. 2, pp. 3442-3453 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | IEEE, 01.05.2025 |
Summary: | Recent studies have demonstrated the potential of speech imagery neural signals in brain-computer interface (BCI) technology. Text generation based on speech imagery offers a natural communication method for individuals with speech disabilities. However, the limited range of imagined content and the immaturity of text generation technology currently hinder its application. This study therefore proposes an enhanced CTA-BiLSTM model for text generation from speech imagery electroencephalography (EEG) signals, with the goal of improving the accuracy and fluency of the generated text. First, in contrast to the prevailing paradigm of imagining characters and words, the study assembles a sentence-level EEG dataset from ten subjects to facilitate communication. Next, to address the temporal dynamics and sequence dependencies of sentence signals, dynamic time warping (DTW) and hidden Markov models (HMM) are employed for temporal alignment and signal annotation, generating fine-grained sentence labels. Finally, the proposed CTA-BiLSTM model uses a channel-time attention mechanism to dynamically adjust weights across channels and time, emphasizing critical features, while the bidirectional long short-term memory (BiLSTM) network captures long-term dependencies in the EEG signals, improving the model's accuracy in decoding complex temporal patterns. The experimental results show that the average sentence decoding accuracy can reach 67.50% on the self-built dataset, validating the approach's potential for application. |
---|---|
ISSN: | 0098-3063 1558-4127 |
DOI: | 10.1109/TCE.2025.3557912 |
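
The summary names dynamic time warping (DTW) as the temporal alignment step. As a rough illustration of what DTW computes, the sketch below aligns two one-dimensional sequences with the classic dynamic-programming recurrence. It is not the authors' implementation: the function name `dtw`, the absolute-difference local cost, and the single-channel input are all assumptions made for illustration, and real EEG data would be multichannel and would need a vector distance.

```python
from math import inf


def dtw(a, b):
    """Align 1-D sequences a and b; return (total cost, warping path).

    Illustrative only: uses |x - y| as the local cost (an assumption).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    template = [0, 1, 2, 3]
    stretched = [0, 1, 1, 2, 3]  # same shape, one sample repeated
    total, path = dtw(template, stretched)
    print(total, path)
```

In this toy case the second sequence is a time-stretched copy of the first, so the alignment absorbs the repeated sample at no cost. How the aligned segments are then passed to an HMM for annotation is not described in the summary and is left out here.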