Speaker normalization for Chinese vowel recognition in cochlear implants

Bibliographic Details
Published in	IEEE Transactions on Biomedical Engineering, Vol. 52, no. 7, pp. 1358-1361
Main Authors	Luo, X., Fu, Q.-J.
Format	Journal Article
Language	English
Published	United States: IEEE, 01.07.2005; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Artificial Intelligence; Auditory implants; China; Cochlear Implants; Computer-Aided Design; Equipment Failure Analysis; Filter bank; Frequency; Humans; Pattern analysis; Pattern matching; Pattern recognition; Phonation; Prosthesis Design; Sound Spectrography - methods; speaker normalization; Speech Acoustics; Speech Perception; Speech processing; Speech recognition; Speech Recognition Software; Testing; Transplants & implants; vowel recognition
ISSN	0018-9294 1558-2531
DOI	10.1109/TBME.2005.847530

Summary:	Because of the limited spectro-temporal resolution associated with cochlear implants, implant patients often have greater difficulty with multitalker speech recognition. The present study investigated whether multitalker speech recognition can be improved by applying speaker normalization techniques to cochlear implant speech processing. Multitalker Chinese vowel recognition was tested with normal-hearing Chinese-speaking subjects listening to a 4-channel cochlear implant simulation, with and without speaker normalization. For each subject, speaker normalization was referenced to the speaker that produced the best recognition performance under conditions without speaker normalization. To match the remaining speakers to this "optimal" output pattern, the overall frequency range of the analysis filter bank was adjusted for each speaker according to the ratio of the mean third formant frequency values between the specific speaker and the reference speaker. Results showed that speaker normalization provided a small but significant improvement in subjects' overall recognition performance. After speaker normalization, subjects' patterns of recognition performance across speakers changed, demonstrating the potential for speaker-dependent effects with the proposed normalization technique.
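
The filter-bank adjustment described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the 100-6000 Hz reference range, the logarithmic band spacing, the formant values, and the function names band_edges and normalized_band_edges are all hypothetical, and the direction of scaling (multiplying the reference band edges by the speaker-to-reference ratio of mean third formant frequencies) is an assumption drawn from the wording of the summary.

```python
def band_edges(f_low, f_high, n_channels):
    """Return n_channels + 1 log-spaced band edges in Hz (spacing is an assumption)."""
    ratio = f_high / f_low
    return [f_low * ratio ** (i / n_channels) for i in range(n_channels + 1)]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def normalized_band_edges(reference_edges, speaker_f3, reference_f3):
    """Scale the reference analysis range by the mean-F3 ratio (speaker / reference).

    Assumed direction: a speaker with higher formants gets a proportionally
    higher analysis range, so the channel outputs resemble the reference pattern.
    """
    scale = mean(speaker_f3) / mean(reference_f3)
    return [edge * scale for edge in reference_edges]


# Hypothetical numbers, for illustration only.
reference_edges = band_edges(100.0, 6000.0, 4)   # 4-channel simulation
reference_f3 = [2500.0, 2600.0, 2400.0]          # F3 per vowel token, Hz
speaker_f3 = [3000.0, 3100.0, 2900.0]

speaker_edges = normalized_band_edges(reference_edges, speaker_f3, reference_f3)
for lo, hi in zip(speaker_edges[:-1], speaker_edges[1:]):
    print(f"{lo:8.1f} Hz to {hi:8.1f} Hz")
```

With these made-up values the mean F3 ratio is 1.2, so every band edge of the hypothetical reference filter bank shifts upward by 20 percent while the number of channels stays at four.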