Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition

Bibliographic Details
Published in	IEEE Access, Vol. 7, pp. 164320-164326
Main Authors	Takashima, Yuki; Takashima, Ryoichi; Takiguchi, Tetsuya; Ariki, Yasuo
Format	Journal Article
Language	English
Published	Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.01.2019
Subjects	Acoustics; Assistive technology; Data models; Decoding; deep learning; dysarthria; end-to-end model; Hidden Markov models; Knowledge management; knowledge transfer; Languages; Machine learning; multilingual; Speech; speech processing; Speech recognition; Voice recognition
ISSN	2169-3536
DOI	10.1109/ACCESS.2019.2951856

Abstract	In this paper, we present an end-to-end speech recognition system for Japanese persons with articulation disorders resulting from athetoid cerebral palsy. Because their utterances are often unstable or unclear, speech recognition systems struggle to recognize their speech. Recent deep learning-based approaches have shown promising performance. However, these approaches require a large amount of training data, and it is difficult to collect sufficient data from people with dysarthria. This paper proposes a transfer learning method that transfers two types of knowledge, each drawn from a different dataset: the language-dependent (phonetic and linguistic) characteristics of unimpaired speech and the language-independent characteristics of dysarthric speech. The former are obtained from Japanese non-dysarthric speech data, and the latter from non-Japanese dysarthric speech data. In the proposed method, we pre-train a model using Japanese non-dysarthric speech and non-Japanese dysarthric speech, and then fine-tune the model using the target Japanese dysarthric speech. To handle speech data in two different languages within one model, we employ language-specific decoder modules. Experimental results indicate that the proposed approach significantly improves speech recognition performance compared with other approaches that do not use additional speech data.
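Illustrative note	The two-stage schedule described in the abstract can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the class and function names are hypothetical, the batch counts are arbitrary, and the idea that an encoder is shared across languages (with every stage updating it) is an assumption. The abstract states only that language-specific decoder modules are used within one model.

```python
from dataclasses import dataclass, field


@dataclass
class MultiDecoderModel:
    """Hypothetical stand-in: one shared encoder plus one decoder per language."""
    encoder_updates: int = 0
    decoder_updates: dict = field(default_factory=dict)

    def train_step(self, language: str) -> None:
        # Assumption: every batch updates the shared encoder...
        self.encoder_updates += 1
        # ...but only the decoder that matches the language of the batch.
        self.decoder_updates[language] = self.decoder_updates.get(language, 0) + 1


def run_schedule(model, pretrain_batches, finetune_batches):
    # Stage 1: pre-train on the two auxiliary corpora.
    for language, _corpus in pretrain_batches:
        model.train_step(language)
    # Stage 2: fine-tune on the target corpus only.
    for language, _corpus in finetune_batches:
        model.train_step(language)
    return model


# Arbitrary toy batch lists; "other" stands for the non-Japanese language.
pretrain = [("ja", "Japanese non-dysarthric"), ("other", "non-Japanese dysarthric")] * 3
finetune = [("ja", "Japanese dysarthric")] * 2

model = run_schedule(MultiDecoderModel(), pretrain, finetune)
print(model.encoder_updates, model.decoder_updates)
```

In this sketch the non-Japanese decoder is touched only during pre-training, while the Japanese decoder is used in both stages, which is one plausible reading of how a single model could handle both languages.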
Author details	1. Yuki Takashima (ORCID 0000-0001-8489-9487; takashima@kobe-u.ac.jp), Graduate School of System Informatics, Kobe University, Kobe, Japan
	2. Ryoichi Takashima (ORCID 0000-0002-9808-0250), Graduate School of System Informatics, Kobe University, Kobe, Japan
	3. Tetsuya Takiguchi (ORCID 0000-0001-5005-7679), Graduate School of System Informatics, Kobe University, Kobe, Japan
	4. Yasuo Ariki (ORCID 0000-0003-3473-2026), Graduate School of System Informatics, Kobe University, Kobe, Japan
CODEN	IAECCG
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
Funding	Japan Society for the Promotion of Science, grant JP17J04380 (funder ID 10.13039/501100001691)
Open Access	Yes
Peer Reviewed	Yes
License	https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccessLink	https://doaj.org/article/57867bac0f8249a8abebb5aa8cf8cd56