Speech Recognition with Amplitude and Frequency Modulations

Bibliographic Details
Published in	Proceedings of the National Academy of Sciences - PNAS Vol. 102; no. 7; pp. 2293 - 2298
Main Authors	Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli; Merzenich, Michael M.
Format	Journal Article
Language	English
Published	United States: National Academy of Sciences, 15.02.2005
Subjects	Audio equipment; Automatic speech recognition; Biological Sciences; Cochlear Implants; Comparative analysis; Deafness - psychology; Deafness - therapy; Engineering; Female; Frequency modulation; Humans; Line spectra; Listening; Male; Neurology; Perceptual Masking; Physical Sciences; Radio frequency; Signal noise; Spectral bands; Speech Acoustics; Speech Perception; Speech recognition; Supernova remnants; Theatrical masks; Voice recognition
Summary:	Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance.
Bibliography:	Author contributions: F.-G.Z. designed research; F.-G.Z., K.N., G.S.S., Y.-Y.K., M.V., A.B., C.W., and K.C. performed research; F.-G.Z., K.N., G.S.S., Y.-Y.K., M.V., C.W., and K.C. analyzed data; and F.-G.Z. wrote the paper. To whom correspondence should be addressed. E-mail: fzeng@uci.edu. This paper was submitted directly (Track II) to the PNAS office. Abbreviations: AM, amplitude modulation; FM, frequency modulation; SRT, speech reception threshold; SNR, signal-to-noise ratio; IEEE, Institute of Electrical and Electronics Engineers; CUNY, City University of New York; HINT, Hearing in Noise Test. Edited by Michael M. Merzenich, University of California, San Francisco, CA.
ISSN:	0027-8424 (print); 1091-6490 (online)
DOI:	10.1073/pnas.0406460102