Text Generation of Speech Imagery Based on an Enhanced CTA-BiLSTM Model Utilizing EEG Signals
Published in | IEEE transactions on consumer electronics Vol. 71; no. 2; pp. 3442 - 3453 |
Main Authors | Chu, Xin; Pan, Hongguang; Wang, Mei; Wang, Yiran; Miao, Rui; Li, Zhuoyi |
Format | Journal Article |
Language | English |
Published | IEEE, 01.05.2025 |
Abstract | Recent studies have demonstrated the potential application of speech imagery neural signals in brain-computer interface (BCI) technology. Text generation based on speech imagery offers a natural communication method for individuals with speech disabilities. However, the limitations in imagined content and the immaturity of text generation technology currently constitute an obstacle to its applications. Therefore, this study proposes an enhanced CTA-BiLSTM model for efficient text generation utilizing speech imagery electroencephalography (EEG) signals, significantly enhancing the accuracy and fluency of text generation. Firstly, distinct from the prevailing imagination of characters and words, this study has assembled a sentence-level EEG dataset from ten subjects to facilitate communication. Subsequently, addressing the temporal dynamics characteristics and sequence dependencies of sentence signals, we employ dynamic time warping (DTW) and hidden Markov models (HMM) for accurate temporal alignment and signal annotation to generate fine-grained sentence labels. Finally, the proposed CTA-BiLSTM model leverages channel-time attention mechanism to dynamically adjust weights across channels and time, emphasizing critical features. Concurrently, the bidirectional long short-term memory (BiLSTM) network captures and utilizes long-term dependencies in the EEG signals, thereby enhancing the accuracy of the model in decoding complex temporal patterns. The experimental results demonstrate that the average sentence decoding accuracy can reach 67.50% on the self-built dataset, realizing a better evaluation accuracy and validating its potential for application. |
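The two signal-processing ideas named in the abstract can be sketched minimally: dynamic time warping (DTW) for temporal alignment, and a channel-time reweighting in the spirit of the channel-time attention mechanism. This is a hypothetical illustration only, not the authors' implementation; the function names, the 1-D DTW simplification, and the fixed (unlearned) score vectors are all assumptions for the sketch.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences.

    A minimal sketch of the temporal-alignment step the abstract
    mentions; the paper's multichannel EEG alignment is more involved.
    """
    n, m = len(a), len(b)
    # D[i, j] = minimal cumulative alignment cost of a[:i] vs b[:j]
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def channel_time_attention(x, channel_scores, time_scores):
    """Reweight a (channels x time) EEG window by softmaxed channel and
    time scores -- a stand-in for the channel-time attention idea.
    In the real model the score vectors would be learned, not given."""
    a_c = np.exp(channel_scores - channel_scores.max())
    a_c /= a_c.sum()
    a_t = np.exp(time_scores - time_scores.max())
    a_t /= a_t.sum()
    # Broadcast channel weights down columns, time weights across rows.
    return x * a_c[:, None] * a_t[None, :]
```

Warped but otherwise identical sequences get a DTW distance of zero (e.g. `[1, 2, 3]` vs `[1, 2, 2, 3]`), which is why DTW suits aligning imagined-speech trials of varying pace before HMM-based annotation.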
Author | Chu, Xin; Pan, Hongguang; Wang, Mei; Wang, Yiran; Miao, Rui; Li, Zhuoyi |
Author_xml | – sequence: 1 givenname: Hongguang orcidid: 0000-0002-0390-6188 surname: Pan fullname: Pan, Hongguang email: hongguangpan@163.com organization: College of Electrical and Control Engineering and the Xi'an Key Laboratory of Electrical Equipment Condition Monitoring and Power Supply Security, Xi'an University of Science and Technology, Xi'an, China – sequence: 2 givenname: Xin surname: Chu fullname: Chu, Xin email: chuxin_029@163.com organization: College of Electrical and Control Engineering and the Xi'an Key Laboratory of Electrical Equipment Condition Monitoring and Power Supply Security, Xi'an University of Science and Technology, Xi'an, China – sequence: 3 givenname: Rui surname: Miao fullname: Miao, Rui email: miaor@hqvt.net organization: Research and Development Department, Shenzhen HQVT Technology Company LTD., Shenzhen, China – sequence: 4 givenname: Mei orcidid: 0000-0001-7834-5517 surname: Wang fullname: Wang, Mei email: wangm@xust.edu.cn organization: College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, China – sequence: 5 givenname: Yiran surname: Wang fullname: Wang, Yiran email: yi_ran_wang@163.com organization: College of Electrical and Control Engineering and the Xi'an Key Laboratory of Electrical Equipment Condition Monitoring and Power Supply Security, Xi'an University of Science and Technology, Xi'an, China – sequence: 6 givenname: Zhuoyi surname: Li fullname: Li, Zhuoyi email: zhuoyilee@163.com organization: School of Automation, Northwestern Polytechnic University, Xi'an, China |
CODEN | ITCEDA |
ContentType | Journal Article |
DBID | 97E RIA RIE AAYXX CITATION |
DOI | 10.1109/TCE.2025.3557912 |
DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 1558-4127 |
EndPage | 3453 |
ExternalDocumentID | 10_1109_TCE_2025_3557912 10949619 |
Genre | orig-research |
GrantInformation_xml | – fundername: Science Research Program of Shaanxi Educational Committee grantid: 23JC049 – fundername: Xi’an Science and Technology Program grantid: 23ZDCYJSGG0025-2022 – fundername: Shaanxi Province Qin Chuangyuan “Scientists + Engineers” Team Construction grantid: 2022KXJ-38 |
IEDL.DBID | RIE |
ISSN | 0098-3063 |
IngestDate | Thu Aug 21 00:34:58 EDT 2025 Wed Aug 27 07:36:57 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
LinkModel | DirectLink |
ORCID | 0000-0002-0390-6188 0000-0001-7834-5517 |
PageCount | 12 |
ParticipantIDs | crossref_primary_10_1109_TCE_2025_3557912 ieee_primary_10949619 |
PublicationCentury | 2000 |
PublicationDate | 2025-May |
PublicationDateYYYYMMDD | 2025-05-01 |
PublicationDate_xml | – month: 05 year: 2025 text: 2025-May |
PublicationDecade | 2020 |
PublicationTitle | IEEE transactions on consumer electronics |
PublicationTitleAbbrev | T-CE |
PublicationYear | 2025 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
References | Li (ref5) 2017; 38 ref13 ref12 ref15 ref14 ref31 ref30 ref11 ref10 ref2 ref1 ref17 ref16 ref19 ref18 ref24 ref23 ref26 ref25 ref20 ref22 ref21 ref28 ref27 ref29 ref8 ref7 ref9 ref4 ref3 ref6 |
References_xml | – ident: ref6 doi: 10.1016/j.neuroscience.2023.12.001 – ident: ref26 doi: 10.1075/ijcl.2.1.07ray – ident: ref11 doi: 10.1007/s11571-022-09819-w – ident: ref24 doi: 10.1080/2326263X.2019.1698928 – ident: ref21 doi: 10.1109/tcss.2024.3462823 – ident: ref25 doi: 10.3389/fnins.2020.00290 – ident: ref31 doi: 10.1109/TIM.2023.3300473 – ident: ref7 doi: 10.1109/TCE.2023.3330423 – ident: ref30 doi: 10.1007/s11042-023-15664-8 – ident: ref19 doi: 10.1016/j.bspc.2021.103241 – ident: ref9 doi: 10.1109/TNSRE.2003.810426 – ident: ref14 doi: 10.1109/TNSRE.2021.3111689 – ident: ref20 doi: 10.1109/TNNLS.2018.2789927 – ident: ref1 doi: 10.1109/TCE.2024.3368569 – ident: ref29 doi: 10.1002/hbm.25136 – ident: ref27 doi: 10.1016/j.neuron.2019.10.020 – ident: ref17 doi: 10.1109/TNSRE.2023.3241846 – ident: ref8 doi: 10.1109/TBME.2024.3376603 – ident: ref3 doi: 10.1126/science.aaa5417 – ident: ref18 doi: 10.1016/j.neunet.2020.05.032 – ident: ref15 doi: 10.1109/TNSRE.2021.3070327 – ident: ref12 doi: 10.1016/j.asoc.2013.10.023 – ident: ref22 doi: 10.1038/s41467-022-33611-3 – ident: ref16 doi: 10.1109/TNSRE.2022.3149654 – volume: 38 start-page: 1353 issue: 6 year: 2017 ident: ref5 article-title: Electrical somatosensory based P300 for a brain-computer interface system publication-title: Chin. J. Sci. Instrum. – ident: ref2 doi: 10.1038/nature11076 – ident: ref13 doi: 10.1016/j.eswa.2016.04.011 – ident: ref23 doi: 10.1016/j.neucom.2016.01.007 – ident: ref4 doi: 10.1007/s10489-022-04226-4 – ident: ref10 doi: 10.1109/TCE.2024.3370310 – ident: ref28 doi: 10.1016/j.csl.2015.05.005 |
SSID | ssj0014528 |
SourceID | crossref ieee |
SourceType | Index Database Publisher |
StartPage | 3442 |
SubjectTerms | Accuracy; attention mechanism; BCI; BiLSTM; Brain modeling; decode; Decoding; EEG; Electrodes; Electroencephalography; Feature extraction; Hidden Markov models; Labeling; Signal processing algorithms; Speech enhancement; speech imagery; text generation |
Title | Text Generation of Speech Imagery Based on an Enhanced CTA-BiLSTM Model Utilizing EEG Signals |
URI | https://ieeexplore.ieee.org/document/10949619 |
Volume | 71 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | IEEE |