Spectral entropy based feature for robust ASR

Bibliographic Details
Published in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 1; pp. I-193
Main Authors Misra, H., Ikbal, S., Bourlard, H., Hermansky, H.
Format Conference Proceeding
Language English
Published Piscataway, N.J.: IEEE, 2004
Summary: In general, entropy gives us a measure of the number of bits required to represent some information. When applied to a probability mass function (PMF), entropy can also be used to measure the "peakiness" of a distribution. We propose using the entropy of a short-time Fourier transform spectrum, normalised as a PMF, as an additional feature for automatic speech recognition (ASR). A peaky spectrum, which reflects the clear formant structure of voiced sounds, is expected to have low entropy, while a flatter spectrum, corresponding to nonspeech or noisy regions, is expected to have higher entropy. Extending this reasoning further, we introduce the idea of a multiband/multiresolution entropy feature, where we divide the spectrum into equal-size subbands and compute the entropy in each subband. The results show that multiband entropy features used in conjunction with normal cepstral features improve the performance of an ASR system.
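The idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Hamming window, the 256-sample frame, the `eps` guard, and the choice to renormalise each subband to its own PMF are all assumptions made for illustration. The sketch normalises a power spectrum so it sums to one, computes its Shannon entropy in bits, and repeats the computation on roughly equal-size subbands.

```python
import numpy as np


def spectral_entropy(power_spectrum, eps=1e-12):
    """Shannon entropy (bits) of a spectrum normalised to sum to one."""
    p = np.asarray(power_spectrum, dtype=float)
    p = p / (p.sum() + eps)          # treat the spectrum as a PMF
    return float(-np.sum(p * np.log2(p + eps)))


def multiband_entropy(power_spectrum, n_bands):
    """Entropy of each of n_bands roughly equal-size subbands.

    Assumption: each subband is renormalised to its own PMF.
    """
    bands = np.array_split(np.asarray(power_spectrum, dtype=float), n_bands)
    return [spectral_entropy(band) for band in bands]


def frame_power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame (hypothetical front end)."""
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed)) ** 2


# Toy comparison: a pure tone (peaky spectrum) versus white noise (flat spectrum).
rng = np.random.default_rng(0)
n = 256
tone = np.sin(2 * np.pi * 0.1 * np.arange(n))
noise = rng.standard_normal(n)

h_tone = spectral_entropy(frame_power_spectrum(tone))
h_noise = spectral_entropy(frame_power_spectrum(noise))
band_entropies = multiband_entropy(frame_power_spectrum(noise), n_bands=4)
```

In this toy setup the tone frame should yield a lower entropy than the noise frame, matching the intuition stated in the summary. In an ASR front end, per-frame values such as `band_entropies` would presumably be appended to the cepstral feature vector, though the exact combination used in the paper is not given in this record.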
ISBN: 9780780384842, 0780384849
ISSN: 1520-6149, 2379-190X
DOI: 10.1109/ICASSP.2004.1325955