Mean firing rate spike representations for speech recognition

Bibliographic Details
Published in	2010 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 517-520
Main Authors	Harris, John G; Feng, Yukun
Format	Conference Proceeding
Language	English
Published	IEEE 01.05.2010
Subjects	Automatic speech recognition; Biomembranes; Filter bank; Mel frequency cepstral coefficient; Nerve fibers; Neurons; Psychoacoustic models; Speech processing; Speech recognition; Timing
Summary:	The nature of spike coding of auditory signals is studied by comparing mean firing rate codes with conventional approaches in speech recognition tasks. Mean firing rate spike representations are problematic since most auditory nerve fibers are saturated at typical conversation levels. However, these problems are eased somewhat when it is considered that there are other nerve fibers (e.g. low spontaneous firing rate fibers) that could efficiently encode the information at each channel of the cochlea. We show that window-based, mean firing rate features can be used with a crude cochlea model to achieve the same level of performance as a conventional MFCC-based approach. These results assume that there are sufficient neurons available at each channel to average the randomness of the stochastic spike trains. Furthermore, we argue that the mean firing rate features could be augmented with timing cues for further performance improvement.
ISBN:	1424453089; 9781424453085
ISSN:	0271-4302; 2158-1525
DOI:	10.1109/ISCAS.2010.5537579
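
Illustration:	The summary refers to window-based, mean firing rate features obtained by averaging stochastic spike trains over many neurons in each cochlear channel. The Python sketch below shows one way such features could be computed. It is not the authors' implementation: the Bernoulli spike generator, the function names (`simulate_spikes`, `mean_firing_rate_features`), the 25 ms window, the 10 ms hop, and the neuron count are all assumptions chosen for illustration, and no cochlea model is included.

```python
import numpy as np


def simulate_spikes(rates_hz, n_neurons, n_samples, fs, seed=0):
    """Hypothetical stand-in for a cochlea model: independent Bernoulli
    spike trains with a constant rate per channel (an assumption).

    Returns a boolean array of shape (channels, neurons, samples).
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(rates_hz, dtype=float) / fs  # spike probability per sample
    return rng.random((p.size, n_neurons, n_samples)) < p[:, None, None]


def mean_firing_rate_features(spikes, fs, win_s=0.025, hop_s=0.010):
    """Window-based mean firing rate per channel, in spikes per second.

    spikes: array of shape (channels, neurons, samples), nonzero = spike.
    Returns an array of shape (frames, channels).
    """
    spikes = np.asarray(spikes, dtype=float)
    win = int(round(win_s * fs))  # window length in samples
    hop = int(round(hop_s * fs))  # frame shift in samples
    n_samples = spikes.shape[-1]
    if n_samples < win:
        raise ValueError("signal is shorter than one analysis window")

    # Average across neurons first, which reduces spike-train randomness.
    pooled = spikes.mean(axis=1)  # (channels, samples)

    n_frames = 1 + (n_samples - win) // hop
    feats = np.empty((n_frames, pooled.shape[0]))
    for t in range(n_frames):
        segment = pooled[:, t * hop : t * hop + win]
        # Mean spikes per sample times the sampling rate gives spikes/s.
        feats[t] = segment.mean(axis=1) * fs
    return feats


# Two channels with assumed true rates of 50 and 200 spikes/s.
fs = 10_000
true_rates = np.array([50.0, 200.0])
spikes = simulate_spikes(true_rates, n_neurons=200, n_samples=1000, fs=fs)
features = mean_firing_rate_features(spikes, fs)
print(features.shape)
print(features.mean(axis=0))
```

With 200 neurons per channel the per-frame estimates stay close to the assumed rates; lowering `n_neurons` makes the features visibly noisier, which is the dependence on neuron count that the summary points out.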