Spectral entropy based feature for robust ASR
Published in | 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing Vol. 1; pp. I - 193 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | Piscataway, N.J.: IEEE, 2004 |
Summary: | In general, entropy gives us a measure of the number of bits required to represent some information. When applied to a probability mass function (PMF), entropy can also be used to measure the "peakiness" of a distribution. We propose using the entropy of a short-time Fourier transform spectrum, normalised as a PMF, as an additional feature for automatic speech recognition (ASR). A peaky spectrum, which reflects clear formant structure in voiced sounds, is expected to have low entropy, while a flatter spectrum, corresponding to nonspeech or noisy regions, is expected to have higher entropy. Extending this reasoning further, we introduce the idea of a multiband/multiresolution entropy feature, where we divide the spectrum into equal-size subbands and compute the entropy in each subband. The results show that multiband entropy features used in conjunction with normal cepstral features improve the performance of an ASR system. |
---|---|
ISBN: | 9780780384842 0780384849 |
ISSN: | 1520-6149 2379-190X |
DOI: | 10.1109/ICASSP.2004.1325955 |
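
The feature described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `spectral_entropy` and `multiband_entropy`, the use of base-2 logarithms, the treatment of an all-zero frame, and the dropping of leftover bins are all assumptions made here for illustration. The input is assumed to be the power spectrum of one short-time frame, computed by whatever STFT routine is at hand.

```python
import math


def spectral_entropy(power):
    """Entropy (in bits) of a power spectrum normalised to a PMF.

    `power` is a sequence of non-negative spectral magnitudes for one frame.
    """
    total = sum(power)
    if total <= 0:
        # Assumption: an all-zero frame is assigned zero entropy.
        return 0.0
    entropy = 0.0
    for p in power:
        if p > 0:  # bins with zero mass contribute nothing
            q = p / total  # normalise so the bins sum to 1
            entropy -= q * math.log2(q)
    return entropy


def multiband_entropy(power, n_bands):
    """Entropy of each of `n_bands` equal-size, non-overlapping subbands.

    Assumption: if the number of bins is not divisible by `n_bands`,
    the leftover bins at the top of the spectrum are dropped.
    """
    size = len(power) // n_bands
    return [
        spectral_entropy(power[i * size:(i + 1) * size])
        for i in range(n_bands)
    ]


if __name__ == "__main__":
    flat = [1.0] * 8             # noise-like, flat spectrum
    peaky = [100.0] + [1.0] * 7  # one dominant peak
    print(spectral_entropy(flat), spectral_entropy(peaky))
    print(multiband_entropy(flat + peaky, 2))
```

A flat spectrum over 8 bins reaches the maximum of log2(8) = 3 bits, while the peaky spectrum scores lower, which matches the intuition in the summary. The per-band values returned by `multiband_entropy` would then be appended to the cepstral feature vector for each frame.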