Comparison of DTW and HMM for isolated word recognition

Bibliographic Details
Published in 2012 International Conference on Pattern Recognition, Informatics and Medical Engineering pp. 466-470
Main Authors Sajjan, S. C., Vijaya, C.
Format Conference Proceeding
Language English
Published IEEE 01.03.2012
ISBN 1467310379
9781467310376
DOI 10.1109/ICPRIME.2012.6208391

Abstract This study presents limited-vocabulary isolated word recognition using Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, and Dynamic Time Warping (DTW) and a discrete Hidden Markov Model (HMM) for recognition, and compares these methods. Features are extracted from speech frames of 300 samples with a 100-sample overlap, at an input sampling rate of 8 kHz. MFCC analysis gives a better recognition rate than LPC because it operates on a logarithmic scale that resembles the human auditory system, whereas LPC has uniform resolution over the frequency plane. Feature extraction is followed by pattern recognition. Because speech signals vary in temporal rate, DTW is used as one method that provides non-linear alignment between two speech signals. HMM, which models the words statistically, is also presented. Experimentally, recognition accuracy is observed to be better for HMM than for DTW. The database used is the TI-46 isolated word corpus (digits zero through nine) from the Linguistic Data Consortium.
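The framing and alignment steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `frame_signal` and `dtw_distance` are hypothetical, and the Euclidean local distance and the unconstrained three-way step pattern are assumptions, since the abstract does not specify them. Only the frame length (300 samples) and the overlap (100 samples) come from the abstract; together they imply a hop of 200 samples, so at 8 kHz each frame spans 37.5 ms and a new frame starts every 25 ms.

```python
import math


def frame_signal(samples, frame_len=300, overlap=100):
    """Split a sample sequence into overlapping frames.

    frame_len and overlap follow the abstract; dropping the trailing
    partial frame is an assumption of this sketch.
    """
    hop = frame_len - overlap  # 200 samples between frame starts
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumed here: Euclidean local distance and the basic step pattern
    (match, insertion, deletion) with no global path constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    frames = frame_signal(list(range(1000)))
    print(len(frames), len(frames[0]))

    # The same contour spoken at two different rates aligns at zero cost.
    slow = [[0.0], [1.0], [1.0], [2.0]]
    fast = [[0.0], [1.0], [2.0]]
    print(dtw_distance(slow, fast))
```

In a template-matching recognizer of this kind, an unknown utterance would typically be assigned the label of the reference template with the smallest accumulated cost; how the paper selects or averages templates is not stated in the abstract.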
Author Sajjan, S. C.
Vijaya, C.
Author_xml – sequence: 1
  givenname: S. C.
  surname: Sajjan
  fullname: Sajjan, S. C.
  email: sharadasajjan@yahoo.com
  organization: Dept. of Electron. & Commun. Eng., SDM Coll. of Eng. & Technol., Dharwad, India
– sequence: 2
  givenname: C.
  surname: Vijaya
  fullname: Vijaya, C.
  email: vijayc26@yahoo.com
  organization: Dept. of Electron. & Commun. Eng., SDM Coll. of Eng. & Technol., Dharwad, India
EISBN 1467310395
9781467310383
9781467310390
1467310387
EndPage 470
ExternalDocumentID 6208391
Genre orig-research
PageCount 5
PublicationTitle 2012 International Conference on Pattern Recognition, Informatics and Medical Engineering
PublicationTitleAbbrev ICPRIME
SubjectTerms Accuracy
Dynamic Time Warping (DTW)
Feature extraction
Hidden Markov Model (HMM)
Hidden Markov models
Linear Predictive Coding(LPC)
Mel frequency cepstral coefficient
Mel Frequency Cepstral Coefficients (MFCC)
Speech
Speech recognition
Vectors
Title Comparison of DTW and HMM for isolated word recognition
URI https://ieeexplore.ieee.org/document/6208391