Automatic speech recognition: a survey

Bibliographic Details
Published in Multimedia tools and applications Vol. 80; no. 6; pp. 9411 - 9457
Main Authors Malik, Mishaim; Malik, Muhammad Kamran; Mehmood, Khawar; Makhdoom, Imran
Format Journal Article
Language English
Published New York Springer US 01.03.2021
Springer Nature B.V
Abstract Recently, great strides have been made in the field of automatic speech recognition (ASR) through various deep learning techniques. In this study, we present a thorough comparison of the cutting-edge techniques currently used in this area, with a special focus on deep learning methods. The study explores different feature extraction methods and state-of-the-art classification models, together with their impact on an ASR system. Because deep learning techniques are highly data-dependent, the speech datasets available online are also discussed in detail. Finally, the online toolkits, resources, and language models that can help in building an ASR system are presented. In this study, we sought to capture every aspect that can affect the performance of an ASR system. Hence, we expect this work to be a good starting point for academics interested in ASR research.
Author Malik, Mishaim
Makhdoom, Imran
Malik, Muhammad Kamran
Mehmood, Khawar
Author_xml – sequence: 1
  givenname: Mishaim
  orcidid: 0000-0002-4917-7144
  surname: Malik
  fullname: Malik, Mishaim
  email: mishaimmalik30@gmail.com
  organization: Punjab University College of Information Technology (PUCIT)
– sequence: 2
  givenname: Muhammad Kamran
  surname: Malik
  fullname: Malik, Muhammad Kamran
  organization: Faculty of Punjab University College of Information Technology (PUCIT)
– sequence: 3
  givenname: Khawar
  surname: Mehmood
  fullname: Mehmood, Khawar
  organization: School of Engineering and Information Technology, University of New South Wales (UNSW) Canberra at ADFA
– sequence: 4
  givenname: Imran
  surname: Makhdoom
  fullname: Makhdoom, Imran
  organization: Faculty of Engineering and IT, University of Technology Sydney
ContentType Journal Article
Copyright Springer Science+Business Media, LLC, part of Springer Nature 2020
DOI 10.1007/s11042-020-10073-7
Discipline Engineering
Computer Science
EISSN 1573-7721
EndPage 9457
ISSN 1380-7501
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords ASR
Feature extraction
Classification models
Language models
Speech recognition
Automatic speech recognition
Language English
ORCID 0000-0002-4917-7144
PublicationDate 2021-03-01
PublicationPlace New York; Dordrecht
PublicationSubtitle An International Journal
PublicationTitle Multimedia tools and applications
PublicationTitleAbbrev Multimed Tools Appl
PublicationYear 2021
Publisher Springer US; Springer Nature B.V
References Forsberg M (2003) Why is speech recognition difficult. Chalmers University of Technology.
AnusuyaMAKattiSKFront end analysis of speech recognition: a reviewInt J Speech Technol201114299145
BernardoJMBayarriMJBergerJODawidAPHeckermanDSmithAFMWestMGenerative or discriminative? Getting the best of both worldsBayesian stat2007833242433187
Messaoud Z B, Hamida A B (2010) CDHMM parameters selection for speaker-independent phone recognition in continuous speech system. In MELECON 2010-2010 15th IEEE Mediterranean Electrotechnical conference (pp. 253-258). IEEE.
PaulsonLDSpeech recognition moves from software to hardwareComputer200639111518
Bu H, Du J, Na X, Wu B, Zheng H (2017). Aishell-1: an open-source mandarin speech corpus and a speech recognition baseline. In 2017 20th conference of the oriental chapter of the international coordinating committee on speech databases and speech I/O systems and assessment (O-COCOSDA) (pp. 1-5). IEEE.
Sabah R, Ainon RN (2009) Isolated digit speech recognition in Malay language using neuro-fuzzy approach. In 2009 third Asia international conference on Modelling & Simulation (pp. 336-340). IEEE
DavisKHBiddulphRBalashekSAutomatic recognition of spoken digitsJ Acoust Soc Am1952246637642
Lazli L, Sellami M (2003) Connectionist probability estimators in HMM arabic speech recognition using fuzzy logic. In international workshop on machine learning and data Mining in Pattern Recognition (pp. 379-388). Springer, Berlin, Heidelberg.
Lawrence R (2008) Fundamentals of speech recognition. Pearson Education India.
Tang H, Meng CH, Lee LS (2010) An initial attempt for phoneme recognition using structured support vector machine (SVM). In 2010 IEEE international conference on acoustics, speech and signal processing (pp. 4926-4929). IEEE
Hannun A, Case C, Casper J, Catanzaro B, Diamos G, Elsen E, ..., Ng A Y (2014) Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567.
Toshniwal S, Sainath T N, Weiss R J, Li B, Moreno P, Weinstein E, Rao K (2018) Multilingual speech recognition with a single end-to-end model. In 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 4904-4908). IEEE.
TrentinEGoriMA survey of hybrid ANN/HMM models for automatic speech recognitionNeurocomputing2001371–4911260963.68651
Venkateswarlu R L K, Kumari R V (2011) Novel approach for speech recognition by using self—organized maps. In 2011 international conference on emerging trends in networks and computer communications (ETNCC) (pp. 215-222). IEEE.
Tavanaei A, Manzuri M T, Sameti H (2011) Mel-scaled discrete wavelet transform and dynamic features for the Persian phoneme recognition. In 2011 international symposium on artificial intelligence and signal processing (AISP) (pp. 138-140). IEEE.
ForgieJWForgieCDResults obtained from a vowel recognition computer programJ Acoust Soc Am1959311114801489
TrentinEGoriMRobust combination of neural networks and hidden Markov models for speech recognitionIEEE Trans Neural Netw200314615191531
Leung K F, Leung F H, Lam H K, Tam P K S (2003) Recognition of speech commands using a modified neural fuzzy network and an improved GA. In the 12th IEEE international conference on fuzzy systems, 2003. FUZZ’03. (Vol. 1, pp. 190-195). IEEE.
Lee J Y, Hung J W (2011) Exploiting principal component analysis in modulation spectrum enhancement for robust speech recognition. In 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD) (Vol. 3, pp. 1947-1951). IEEE.
ThubthongNKijsirikulBSupport vector machines for Thai phoneme recognitionInt J Uncertainty Fuzziness Knowledge Based Syst20019068038131113.68474
PiconeJWSignal modeling techniques in speech recognitionProc IEEE199381912151247
Vapnik V (2013) The nature of statistical learning theory. Springer science & business media
Weston J, Watkins C (1998) Multi-class support vector machines (pp. 98-04). Technical report CSD-TR-98-04, Department of Computer Science, Royal Holloway, University of London, may
Saha G, Chakroborty S, Senapati S (2005) A new silence removal and endpoint detection algorithm for speech and speaker recognition applications. In proceedings of the NCC (pp. 56-61).
BussoCBulutMLeeCCKazemzadehAMowerEKimSChangJNLeeSNarayananSSIEMOCAP: interactive emotional dyadic motion capture databaseLang Resour Eval2008424335359
KohonenTSelf-organized formation of topologically correct feature mapsBiol Cybern198243159696678890466.92002
PingZLi-ZhenTDong-FengXSpeech recognition algorithm of parallel subband HMM based on wavelet analysis and neural networkInf Technol J200985796800
Li T F, Chang S C (2007) Speech recognition of mandarin syllables using both linear predict coding cepstra and Mel frequency cepstra. In ROCLING 2007 poster papers (pp. 379-390).
LinCFWangSDFuzzy support vector machinesIEEE Trans Neural Netw2002132464471
Rousseau A, Deléglise P, Esteve Y (2012) TED-LIUM: an automatic speech recognition dedicated corpus. In LREC (pp. 125-129).
Rosenfeld R (1994) A hybrid approach to adaptive statistical language modeling. CARNEGIE-MELLON UNIV PITTSBURGH PA SCHOOL OF COMPUTER SCIENCE
MehlaRAggarwalRAutomatic speech recognition: a surveyInt J Adv Res Comput Sci Electron Eng (IJARCSEE)2014314553
O’ShaughnessyDAutomatic speech recognition: history, methods and challengesPattern Recogn20084110296529791161.68772
Zhao Y, Wakita H, Zhuang X (1991) An HMM based speaker-independent continuous speech recognition system with experiments on the TIMIT DATABASE. In acoustics, speech, and signal processing, IEEE international conference on (pp. 333-336). IEEE computer society
AnusuyaMAKattiSKComparison of different speech feature extraction techniques with and without wavelet transform to Kannada speech recognitionInt J Comput Appl20112641924
Garofolo JS (1993) TIMIT acoustic phonetic continuous speech corpus. Linguist Data Consortium 1993
HermanskyHMorganNRASTA processing of speechIEEE Trans Speech Audio Process199424578589
RadhaVVimalaCA review on speech recognition challenges and approachesDoaj Org20122117
Woodland PC, Leggetter CJ, Odell JJ, Valtchev V, Young SJ (1995) The 1994 HTK large vocabulary speech recognition system. In 1995 international conference on acoustics, speech, and signal processing (Vol. 1, pp. 73-76). IEEE
Duan KB, Keerthi SS (2005) Which is the best multiclass SVM method? An empirical study. In international workshop on multiple classifier systems (pp. 278-285). Springer, Berlin, Heidelberg
Solera-Ureña R, Padrell-Sendra J, Martín-Iglesias D, Gallardo-Antolín A, Peláez-Moreno C, Díaz-de-María F (2007) Svms for automatic speech recognition: a survey. In Progress in nonlinear speech processing (pp. 190–216). Springer, Berlin, Heidelberg
VelichkoVMZagoruykoNGAutomatic recognition of 200 wordsInt J Man Mach Stud197023223234
SaeedTRSalmanJAliAHClassification improvement of spoken arabic language based on radial basis functionInt J Electr Comput Eng20199120888708
Gamulkiewicz B, Weeks M (2003) Wavelet based speech recognition. In 2003 46th Midwest symposium on circuits and systems (Vol. 2, pp. 678-681). IEEE.
O'ShaughnessyDLinear predictive codingIEEE potentials1988712932
Nataraj K S, Pandey P C, Shah M S (2011) Improving the consistency of vocal tract shape estimation. In 2011 National Conference on communications (NCC) (pp. 1-5). IEEE.
WalkerSLFooSYOptimal wavelets for speech signal representationsJ Syst Cybern Inform2003144446
Sárosi G, Mozsáry M, Mihajlik P, Fegyó T (2011) Comparison of feature extraction methods for speech recognition in noise-free and in traffic noise environment. In 2011 6th conference on speech technology and human-computer dialogue (SpeD) (pp. 1-8). IEEE.
Chen C P, Bilmes J, Ellis D P (2005) Speech feature smoothing for robust ASR. In proceedings.(ICASSP'05). IEEE international conference on acoustics, speech, and signal processing, 2005. (Vol. 1, pp. I-525). IEEE.
Mohamadpour M, Farokhi F (2009) A new approach for Persian speech recognition. In 2009 IEEE international advance computing conference (pp. 153-158). IEEE
Makino T, Liao H, Assael Y, Shillingford B, Garcia B, Braga O, Siohan O (2019) Recurrent neural network transducer for audio-visual speech recognition. In 2019 IEEE automatic speech recognition and understanding workshop (ASRU) (pp. 905-912). IEEE
Venkateswarlu RLK, Kumari RV, Jayasri GV (2011) Speech recognition using radial basis function neural network. In 2011 3rd international conference on electronics computer technology (Vol. 3, pp. 441-445). IEEE
Du X P, He P L (2006) The clustering solution of speech recognition models with SOM. In international symposium on neural networks (pp. 150-157). Springer, Berlin, Heidelberg.
Coifman R R, Meyer Y, Wickerhauser V (1992) Wavelet analysis and signal processing. In In Wavelets and their applications.
HungJWFanHTSubband feature statistics normalization techniques based on a discrete wavelet transform for robust speech recognitionIEEE Signal Process Lett20091698068092572421
Ranjan S (2010) A discrete wavelet transform based approach to Hindi speech recognition. In 2010 international conference on signal acquisition and processing (pp. 345-348). IEEE.
Tang X (2009) Hybrid hidden Markov model and artificial neural network for automatic speech recognition. In 2009 Pacific-Asia conference on circuits, communications and systems (pp. 682-685). IEEE.
Cutajar M, Gatt E, Micallef J, Grech I, Casha O (2010) Digital hardware implementation of self-organising maps. In Melecon 2010-2010 15th IEEE Mediterranean Electrotechnical conference (pp. 1123-1128). IEEE
Fontaine V, Ris C, Leich H (1996) Nonlinear discriminant analysis with neural networks for speech recognition. In 1996 8th European signal processing conference (EUSIPCO 1996) (pp. 1-4). IEEE.
Bourlard H A, Morgan N (2012). Connectionist speech recognition: a hybrid approach (Vol. 247). Springer Science & Business Media.
Cheng O, Abdulla W, Salcic Z (2005) Performance evaluation of front-end processing for speech recognition systems. The University of Auckland.
Jung S, Son J, Bae K
SubjectTerms Automatic speech recognition
Computer Communication Networks
Computer Science
Data Structures and Information Theory
Deep learning
Feature extraction
Language modeling
Machine learning
Multimedia Information Systems
Special Purpose and Application-Based Systems
Speech recognition
Voice recognition
Title Automatic speech recognition: a survey
URI https://link.springer.com/article/10.1007/s11042-020-10073-7
https://www.proquest.com/docview/2494716846
Volume 80