Comparison of DTW and HMM for isolated word recognition
This study proposes limited-vocabulary isolated word recognition using Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, and Dynamic Time Warping (DTW) and a discrete Hidden Markov Model (HMM) for recognition, and compares the methods. Feature extraction is car...
Published in | 2012 International Conference on Pattern Recognition, Informatics and Medical Engineering pp. 466 - 470 |
---|---|
Main Authors | Sajjan, S. C.; Vijaya, C. |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.03.2012 |
Subjects | |
ISBN | 1467310379 9781467310376 |
DOI | 10.1109/ICPRIME.2012.6208391 |
Abstract | This study proposes limited-vocabulary isolated word recognition using Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, and Dynamic Time Warping (DTW) and a discrete Hidden Markov Model (HMM) for recognition, and compares the methods. Feature extraction is carried out on speech frames of 300 samples with a 100-sample overlap, at an input sampling rate of 8 kHz. MFCC analysis gives a better recognition rate than LPC because it operates on a logarithmic frequency scale that resembles the human auditory system, whereas LPC has uniform resolution across the frequency plane. Feature extraction is followed by pattern recognition. Because speech signals vary in temporal rate, DTW is used as one method that provides non-linear alignment between two voice signals. A second method, HMM, which models the words statistically, is also presented. Experimentally, recognition accuracy is observed to be better for HMM than for DTW. The database used is the TI-46 isolated word corpus (digits zero through nine) from the Linguistic Data Consortium. |
---|---|
Author | Sajjan, S. C.; Vijaya, C. |
Author_xml | – sequence: 1 givenname: S. C. surname: Sajjan fullname: Sajjan, S. C. email: sharadasajjan@yahoo.com organization: Dept. of Electron. & Commun. Eng., SDM Coll. of Eng. & Technol., Dharwad, India – sequence: 2 givenname: C. surname: Vijaya fullname: Vijaya, C. email: vijayc26@yahoo.com organization: Dept. of Electron. & Commun. Eng., SDM Coll. of Eng. & Technol., Dharwad, India |
ContentType | Conference Proceeding |
EISBN | 1467310395 9781467310383 9781467310390 1467310387 |
EndPage | 470 |
ExternalDocumentID | 6208391 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 5 |
PublicationCentury | 2000 |
PublicationDate | 2012-March |
PublicationDateYYYYMMDD | 2012-03-01 |
PublicationDate_xml | – month: 03 year: 2012 text: 2012-March |
PublicationDecade | 2010 |
PublicationTitle | 2012 International Conference on Pattern Recognition, Informatics and Medical Engineering |
PublicationTitleAbbrev | ICPRIME |
PublicationYear | 2012 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 466 |
SubjectTerms | Accuracy; Dynamic Time Warping (DTW); Feature extraction; Hidden Markov Model (HMM); Hidden Markov models; Linear Predictive Coding (LPC); Mel frequency cepstral coefficient; Mel Frequency Cepstral Coefficients (MFCC); Speech; Speech recognition; Vectors |
Title | Comparison of DTW and HMM for isolated word recognition |
URI | https://ieeexplore.ieee.org/document/6208391 |
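
Illustrative sketch (not taken from the paper): the abstract states that features are extracted from frames of 300 samples with 100 samples of overlap at 8 kHz, and that DTW gives a non-linear time alignment between two utterances. The Python below is a minimal sketch of those two steps under stated assumptions. The function names frame_signal, dtw_distance and recognize are hypothetical, the overlap is assumed to mean that consecutive frames share 100 samples (a hop of 200 samples, or 25 ms), and the Euclidean local cost and unconstrained step pattern are common textbook choices that the record does not confirm for the authors' system. Feature extraction itself (LPC or MFCC) is left out.

```python
import math


def frame_signal(samples, frame_len=300, overlap=100):
    """Split a sample sequence into overlapping frames.

    Assumption: consecutive frames share `overlap` samples, so the hop
    is frame_len - overlap. A trailing partial frame is dropped.
    """
    hop = frame_len - overlap
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local cost and the basic step pattern
    (insertion, deletion, match) with no global path constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b is held
                cost[i][j - 1],      # b advances, a is held
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the stored template closest to `features`."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


# One second of speech at 8 kHz is 8000 samples.
frames = frame_signal(list(range(8000)))

# A slowed-down copy of a template still aligns with zero cost.
template = [(0.0,), (1.0,), (2.0,)]
slow_copy = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
stretched_cost = dtw_distance(template, slow_copy)
```

In a template-matching recognizer of this kind, one reference feature sequence per word would be stored and recognize would pick the word with the smallest accumulated cost. An HMM-based recognizer, which the abstract reports as more accurate, would instead score the utterance against a statistical model per word.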