Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition

Bibliographic Details
Published in IEEE Access, Vol. 7, pp. 164320-164326
Main Authors Takashima, Yuki; Takashima, Ryoichi; Takiguchi, Tetsuya; Ariki, Yasuo
Format Journal Article
Language English
Published Piscataway: IEEE, 01.01.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN 2169-3536
DOI 10.1109/ACCESS.2019.2951856


Abstract In this paper, we present an end-to-end speech recognition system for Japanese persons with articulation disorders resulting from athetoid cerebral palsy. Because their utterances are often unstable or unclear, speech recognition systems struggle to recognize their speech. Recent deep learning-based approaches have exhibited promising performance; however, they require a large amount of training data, and it is difficult to collect sufficient data from such dysarthric speakers. This paper proposes a transfer learning method that transfers two types of knowledge drawn from two different datasets: the language-dependent (phonetic and linguistic) characteristics of unimpaired speech and the language-independent characteristics of dysarthric speech. The former is obtained from Japanese non-dysarthric speech data, and the latter from non-Japanese dysarthric speech data. In the proposed method, we pre-train a model using Japanese non-dysarthric speech and non-Japanese dysarthric speech, and thereafter fine-tune the model using the target Japanese dysarthric speech. To handle the speech data of the two languages in one model, we employ language-specific decoder modules. Experimental results indicate that our proposed approach significantly improves speech recognition performance compared with approaches that do not use additional speech data.
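The following is a minimal sketch of the two-stage training scheme the abstract describes: a shared acoustic encoder with language-specific decoder heads, pre-trained on Japanese non-dysarthric and non-Japanese dysarthric speech and then fine-tuned on the target Japanese dysarthric speech. It is not the authors' implementation; the PyTorch framing, the CTC objective, the layer sizes, the vocabulary sizes, and the feature dimension are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn


class SharedEncoder(nn.Module):
    """Acoustic encoder shared across languages and speaker types."""

    def __init__(self, feat_dim=40, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=3,
                           batch_first=True, bidirectional=True)

    def forward(self, x):                      # x: (batch, time, feat_dim)
        out, _ = self.rnn(x)
        return out                             # (batch, time, 2 * hidden)


class MultilingualASR(nn.Module):
    """One shared encoder; one output (decoder) module per language."""

    def __init__(self, vocab_sizes, feat_dim=40, hidden=256):
        super().__init__()
        self.encoder = SharedEncoder(feat_dim, hidden)
        self.decoders = nn.ModuleDict(
            {lang: nn.Linear(2 * hidden, size) for lang, size in vocab_sizes.items()}
        )

    def forward(self, feats, lang):
        enc = self.encoder(feats)
        return self.decoders[lang](enc).log_softmax(dim=-1)


def train_step(model, optimizer, ctc_loss, feats, targets,
               feat_lens, target_lens, lang):
    """One CTC training step on a batch drawn from a single language."""
    optimizer.zero_grad()
    # nn.CTCLoss expects (time, batch, vocab) log-probabilities.
    log_probs = model(feats, lang).transpose(0, 1)
    loss = ctc_loss(log_probs, targets, feat_lens, target_lens)
    loss.backward()
    optimizer.step()
    return loss.item()


# Stage 1 (pre-training): alternate batches of Japanese non-dysarthric speech
# ("ja") and non-Japanese dysarthric speech ("en"), each routed through its
# own decoder head while sharing the encoder.
# Stage 2 (fine-tuning): continue training the same model on the target
# Japanese dysarthric speech, reusing the "ja" decoder head.
model = MultilingualASR(vocab_sizes={"ja": 80, "en": 30})
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
```

In this sketch only the choice of decoder head depends on the language of the batch; the encoder weights are shared, which is what allows the dysarthria-related knowledge learned from the non-Japanese data to carry over to the Japanese target speaker during fine-tuning.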
Author Takashima, Ryoichi
Ariki, Yasuo
Takashima, Yuki
Takiguchi, Tetsuya
Author_xml – sequence: 1
  givenname: Yuki
  orcidid: 0000-0001-8489-9487
  surname: Takashima
  fullname: Takashima, Yuki
  email: takashima@kobe-u.ac.jp
  organization: Graduate School of System Informatics, Kobe University, Kobe, Japan
– sequence: 2
  givenname: Ryoichi
  orcidid: 0000-0002-9808-0250
  surname: Takashima
  fullname: Takashima, Ryoichi
  organization: Graduate School of System Informatics, Kobe University, Kobe, Japan
– sequence: 3
  givenname: Tetsuya
  orcidid: 0000-0001-5005-7679
  surname: Takiguchi
  fullname: Takiguchi, Tetsuya
  organization: Graduate School of System Informatics, Kobe University, Kobe, Japan
– sequence: 4
  givenname: Yasuo
  orcidid: 0000-0003-3473-2026
  surname: Ariki
  fullname: Ariki, Yasuo
  organization: Graduate School of System Informatics, Kobe University, Kobe, Japan
CODEN IAECCG
CitedBy_id crossref_primary_10_1109_ACCESS_2020_3023783
crossref_primary_10_1109_ACCESS_2024_3374874
crossref_primary_10_1016_j_icte_2021_07_004
crossref_primary_10_1016_j_aeue_2021_153698
crossref_primary_10_32604_cmc_2023_040024
crossref_primary_10_3390_electronics12204278
crossref_primary_10_1016_j_knosys_2023_110851
crossref_primary_10_2147_PRBM_S460283
crossref_primary_10_1016_j_specom_2023_02_004
crossref_primary_10_1186_s13636_023_00318_2
crossref_primary_10_1109_ACCESS_2023_3234110
crossref_primary_10_32604_cmc_2023_037380
Cites_doi 10.1109/EUSIPCO.2015.7362616
10.1109/ICASSP.2018.8461972
10.1109/ICASSP.2018.8462290
10.1109/ICASSP.2011.5947401
10.1109/TNSRE.2016.2638830
10.21437/Interspeech.2017-664
10.1109/ICASSP.2016.7472621
10.21437/Interspeech.2017-878
10.1109/TNSRE.2018.2802914
10.1109/MMSP.2010.5662075
10.1109/CVPR.2016.308
10.1007/s10579-011-9145-0
10.1109/ICASSP.2019.8683091
10.1109/ICASSP.2013.6639347
10.1109/TKDE.2009.191
10.1016/0167-6393(90)90011-W
10.1109/ICSDA.2011.6085978
10.1023/A:1007379606734
10.21437/Interspeech.2017-1318
10.1109/ICASSP.2019.8683803
10.21437/Interspeech.2008-583
10.21437/Interspeech.2018-1751
10.21437/Interspeech.2008-480
10.1109/ICSLP.1996.608020
10.1145/1143844.1143891
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
DBID 97E
ESBDL
RIA
RIE
AAYXX
CITATION
7SC
7SP
7SR
8BQ
8FD
JG9
JQ2
L7M
L~C
L~D
DOA
DOI 10.1109/ACCESS.2019.2951856
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
METADEX
Technology Research Database
Materials Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
Materials Research Database
Engineered Materials Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
METADEX
Computer and Information Systems Abstracts Professional
DatabaseTitleList Materials Research Database


Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: RIE
  name: IEEE Xplore
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2169-3536
EndPage 164326
ExternalDocumentID oai_doaj_org_article_57867bac0f8249a8abebb5aa8cf8cd56
10_1109_ACCESS_2019_2951856
8892556
Genre orig-research
GrantInformation_xml – fundername: Japan Society for the Promotion of Science
  grantid: JP17J04380
  funderid: 10.13039/501100001691
GroupedDBID 0R~
4.4
5VS
6IK
97E
AAJGR
ABAZT
ABVLG
ACGFS
ADBBV
AGSQL
ALMA_UNASSIGNED_HOLDINGS
BCNDV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
ESBDL
GROUPED_DOAJ
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
OK1
RIA
RIE
RNS
AAYXX
CITATION
RIG
7SC
7SP
7SR
8BQ
8FD
JG9
JQ2
L7M
L~C
L~D
ID FETCH-LOGICAL-c518t-b824767ae539ff99e1835f71781e1002a621238ea445a0f72edea7776b0cde233
IEDL.DBID DOA
ISSN 2169-3536
IngestDate Wed Aug 27 01:24:46 EDT 2025
Mon Jun 30 03:53:27 EDT 2025
Tue Jul 01 01:21:50 EDT 2025
Thu Apr 24 22:53:56 EDT 2025
Wed Aug 27 02:44:45 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c518t-b824767ae539ff99e1835f71781e1002a621238ea445a0f72edea7776b0cde233
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
ORCID 0000-0003-3473-2026
0000-0001-5005-7679
0000-0002-9808-0250
0000-0001-8489-9487
OpenAccessLink https://doaj.org/article/57867bac0f8249a8abebb5aa8cf8cd56
PQID 2455616461
PQPubID 4845423
PageCount 7
ParticipantIDs doaj_primary_oai_doaj_org_article_57867bac0f8249a8abebb5aa8cf8cd56
ieee_primary_8892556
crossref_citationtrail_10_1109_ACCESS_2019_2951856
crossref_primary_10_1109_ACCESS_2019_2951856
proquest_journals_2455616461
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2019-01-01
PublicationDateYYYYMMDD 2019-01-01
PublicationDate_xml – month: 01
  year: 2019
  text: 2019-01-01
  day: 01
PublicationDecade 2010
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2019
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref13
ref15
chorowski (ref27) 2015
ref30
ref11
ref32
ref10
ref17
vachhani (ref14) 2017
mohamed (ref3) 2009
aihara (ref19) 2017
ref18
goodfellow (ref20) 2014
ref24
ref23
ref26
ref25
rudzicz (ref2) 2010
ref22
wong (ref12) 2015
ref21
ref28
ref29
kingma (ref31) 2014
ref8
ref7
ref9
ref4
ref6
ref5
duffy (ref1) 2013
matsumasa (ref16) 2008
References_xml – ident: ref18
  doi: 10.1109/EUSIPCO.2015.7362616
– ident: ref25
  doi: 10.1109/ICASSP.2018.8461972
– ident: ref13
  doi: 10.1109/ICASSP.2018.8462290
– ident: ref4
  doi: 10.1109/ICASSP.2011.5947401
– start-page: 577
  year: 2015
  ident: ref27
  article-title: Attention-based models for speech recognition
  publication-title: Proc Neural Inf Process Syst
– ident: ref23
  doi: 10.1109/TNSRE.2016.2638830
– start-page: 3374
  year: 2017
  ident: ref19
  article-title: Phoneme-discriminative features for dysarthric speech conversion
  publication-title: Proc INTERSPEECH
  doi: 10.21437/Interspeech.2017-664
– ident: ref28
  doi: 10.1109/ICASSP.2016.7472621
– ident: ref22
  doi: 10.21437/Interspeech.2017-878
– ident: ref24
  doi: 10.1109/TNSRE.2018.2802914
– start-page: 70
  year: 2010
  ident: ref2
  article-title: Learning mixed acoustic/articulatory models for disabled speech
  publication-title: Proc Neural Inf Process Syst
– year: 2013
  ident: ref1
  publication-title: Motor Speech Disorders: Substrates, Differential Diagnosis, and Management
– year: 2014
  ident: ref31
  article-title: Adam: A method for stochastic optimization
  publication-title: arXiv:1412.6980
– ident: ref17
  doi: 10.1109/MMSP.2010.5662075
– ident: ref32
  doi: 10.1109/CVPR.2016.308
– start-page: 329
  year: 2015
  ident: ref12
  article-title: Development of a Cantonese dysarthric speech corpus
  publication-title: Proc INTERSPEECH
– ident: ref10
  doi: 10.1007/s10579-011-9145-0
– ident: ref15
  doi: 10.1109/ICASSP.2019.8683091
– ident: ref5
  doi: 10.1109/ICASSP.2013.6639347
– ident: ref6
  doi: 10.1109/TKDE.2009.191
– ident: ref30
  doi: 10.1016/0167-6393(90)90011-W
– ident: ref11
  doi: 10.1109/ICSDA.2011.6085978
– ident: ref29
  doi: 10.1023/A:1007379606734
– start-page: 1854
  year: 2017
  ident: ref14
  article-title: Deep autoencoder based speech features for improved dysarthric speech recognition
  publication-title: Proc INTERSPEECH
  doi: 10.21437/Interspeech.2017-1318
– ident: ref7
  doi: 10.1109/ICASSP.2019.8683803
– start-page: 2234
  year: 2008
  ident: ref16
  article-title: Integration of metamodel and acoustic model for speech recognition
  publication-title: Proc INTERSPEECH
  doi: 10.21437/Interspeech.2008-583
– start-page: 2672
  year: 2014
  ident: ref20
  article-title: Generative adversarial nets
  publication-title: Proc Neural Inf Process Syst
– start-page: 39
  year: 2009
  ident: ref3
  article-title: Deep belief networks for phone recognition
  publication-title: Proc NIPS Workshop Deep Learn Speech Recognit Related Appl
– ident: ref21
  doi: 10.21437/Interspeech.2018-1751
– ident: ref9
  doi: 10.21437/Interspeech.2008-480
– ident: ref8
  doi: 10.1109/ICSLP.1996.608020
– ident: ref26
  doi: 10.1145/1143844.1143891
SSID ssj0000816957
Score 2.2865274
SourceID doaj
proquest
crossref
ieee
SourceType Open Website
Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 164320
SubjectTerms Acoustics
Assistive technology
Data models
Decoding
deep learning
dysarthria
end-to-end model
Hidden Markov models
Knowledge management
knowledge transfer
Languages
Machine learning
multilingual
Speech
speech processing
Speech recognition
Voice recognition
Title Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition
URI https://ieeexplore.ieee.org/document/8892556
https://www.proquest.com/docview/2455616461
https://doaj.org/article/57867bac0f8249a8abebb5aa8cf8cd56
Volume 7
linkProvider Directory of Open Access Journals