Artificial neural networks as speech recognisers for dysarthric speech: Identifying the best-performing set of MFCC parameters and studying a speaker-independent approach
Published in | Advanced Engineering Informatics, Vol. 28, No. 1, pp. 102–110 |
Main Authors | Shahamiri, Seyed Reza; Binti Salim, Siti Salwah |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.01.2014 |
Subjects | Artificial neural network; Automatic speech recognition; Dysarthria; Mel-frequency cepstral coefficients |
ISSN | 1474-0346 |
DOI | 10.1016/j.aei.2014.01.001 |
Abstract |
• The best-performing set of MFCC parameters for dysarthric speech was studied.
• A speaker-independent dysarthric ASR model based on ANNs is proposed.
• The ASR systems trained on the mel cepstrum with 12 coefficients provided the best accuracy.
• The proposed speaker-independent ASR model achieved a 68.38% word recognition rate.
• The highest word recognition rate among the speaker-dependent ASR systems was 95%.

Dysarthria is a neurological impairment of control over the motor speech articulators that compromises the speech signal. Automatic Speech Recognition (ASR) can be very helpful for speakers with dysarthria because affected persons are often physically incapacitated. Mel-Frequency Cepstral Coefficients (MFCCs) have been proven an appropriate representation of dysarthric speech, but the question of which MFCC-based feature set represents dysarthric acoustic features most effectively has not been answered. Moreover, most current dysarthric speech recognisers are either speaker-dependent (SD) or speaker-adaptive (SA), and they generalise poorly as speaker-independent (SI) models. First, by comparing the results of 28 SD dysarthric speech recognisers, this study identifies the best-performing set of MFCC parameters for representing dysarthric acoustic features in Artificial Neural Network (ANN)-based ASR. Next, the paper studies the application of ANNs as a fixed-length isolated-word SI ASR for individuals with dysarthria. The results show that the recognisers trained with the conventional 12-coefficient MFCC features, without delta and acceleration features, provided the best accuracy, and the proposed SI ASR recognised the speech of unforeseen dysarthric evaluation subjects with a word recognition rate of 68.38%. |
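The abstract's best-performing feature set — the conventional 12 MFCCs with no delta or acceleration features appended — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 16 kHz sampling rate, 25 ms/10 ms framing, and 26-filter mel bank are common defaults assumed here, and since the record does not say whether the 12 coefficients include c0, this sketch keeps c1–c12.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, frame_len=0.025, hop=0.010,
         n_filters=26, n_ceps=12):
    """Return (n_frames, n_ceps) static MFCCs c1..c12 — no deltas/accelerations."""
    # Pre-emphasis flattens the spectral tilt before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    flen, fhop = int(frame_len * sr), int(hop * sr)
    assert len(emphasized) >= flen, "signal shorter than one analysis frame"
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - flen) // fhop
    frames = np.stack([emphasized[i * fhop:i * fhop + flen]
                       for i in range(n_frames)])
    frames *= np.hamming(flen)
    # Power spectrum of each frame (rfft zero-pads frames to n_fft).
    power = (np.abs(np.fft.rfft(frames, n_fft)) ** 2) / n_fft
    # Triangular mel filterbank spanning 0 .. sr/2.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, ctr, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, ctr):
            fbank[i, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fbank[i, k] = (hi - k) / max(hi - ctr, 1)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # DCT-II of the log filterbank energies; keep coefficients 1..n_ceps.
    dct = np.cos(np.pi / n_filters *
                 np.outer(np.arange(1, n_ceps + 1),
                          np.arange(n_filters) + 0.5))
    return log_energies @ dct.T
```

With these assumed defaults, a 0.5 s utterance at 16 kHz yields 48 frames of 12 coefficients each — the fixed-length feature matrix an isolated-word ANN recogniser would consume.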
Author | Shahamiri, Seyed Reza; Binti Salim, Siti Salwah |
Copyright | 2014 Elsevier Ltd |
Discipline | Applied Sciences |
EndPage | 110 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | Mel-frequency cepstral coefficients; Artificial neural network; Dysarthria; Automatic speech recognition |
ORCID | 0000-0003-1543-5931 |
PageCount | 9 |
PublicationCentury | 2000 |
PublicationDate | January 2014 |
PublicationDateYYYYMMDD | 2014-01-01 |
PublicationDecade | 2010 |
PublicationTitle | Advanced engineering informatics |
PublicationYear | 2014 |
Publisher | Elsevier Ltd |
StartPage | 102 |
SubjectTerms | Artificial neural network; Automatic speech recognition; Dysarthria; Mel-frequency cepstral coefficients |
URI | https://dx.doi.org/10.1016/j.aei.2014.01.001 |
Volume | 28 |