LEARNING PERSONALIZED ENTITY PRONUNCIATIONS

Bibliographic Details
Main Authors	PENG, Fuchun; BEAUFAYS, Francoise; BRUGUIER, Antoine Jean
Format	Patent
Language	English; French
Published	10.08.2017
Subjects	ACOUSTICS; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION
Summary:	Methods, systems, and apparatus, including computer programs encoded on a computer-storage medium, for implementing a pronunciation dictionary. The method includes receiving audio data corresponding to an utterance that includes a command and an entity name. Additionally, the method may include generating, by an automated speech recognizer, an initial transcription for the portion of the audio data that is associated with the entity name; receiving a corrected transcription for that portion of the utterance; obtaining a phonetic pronunciation that is associated with that portion of the audio data; updating a pronunciation dictionary to associate the phonetic pronunciation with the entity name; receiving a subsequent utterance that includes the entity name; and transcribing the subsequent utterance based at least on the updated pronunciation dictionary. Improved speech recognition and higher-quality transcription can be provided.
Bibliography:	Application Number: WO2016US63316
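
The steps listed in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the audio for the entity name has already been reduced to a phoneme sequence, and it stands in for the automated speech recognizer with a simple similarity lookup from the standard-library difflib module. The class PronunciationDictionary, the functions transcribe and learn_from_correction, the contact names, and the phoneme strings are all hypothetical.

```python
from difflib import SequenceMatcher


class PronunciationDictionary:
    """Hypothetical store mapping entity names to known phoneme sequences."""

    def __init__(self):
        self._entries = {}  # entity name -> set of phoneme tuples

    def add(self, entity_name, phonemes):
        # Associate one more phonetic pronunciation with the entity name.
        self._entries.setdefault(entity_name, set()).add(tuple(phonemes))

    def best_match(self, phonemes):
        # Return the entity whose stored pronunciation is most similar to
        # the observed phonemes (a toy stand-in for a real recognizer).
        target = tuple(phonemes)
        best_name, best_score = None, -1.0
        for name, variants in self._entries.items():
            for variant in variants:
                score = SequenceMatcher(None, target, variant).ratio()
                if score > best_score:
                    best_name, best_score = name, score
        return best_name


def transcribe(command, entity_phonemes, dictionary):
    # Transcribe "<command> <entity name>" using the current dictionary.
    return f"{command} {dictionary.best_match(entity_phonemes)}"


def learn_from_correction(dictionary, entity_phonemes, initial_name, corrected_name):
    # When the user corrects the entity name, record the pronunciation that
    # was actually heard under the corrected name.
    if corrected_name != initial_name:
        dictionary.add(corrected_name, entity_phonemes)


# Assumed starting state: only "Samara" is known to the dictionary.
dictionary = PronunciationDictionary()
dictionary.add("Samara", ["s", "ah", "m", "aa", "r", "ah"])

# Assumed phonemes extracted from the entity-name portion of the audio.
heard = ["s", "iy", "ow", "m", "aa", "r", "ah"]

first = transcribe("call", heard, dictionary)        # initial transcription
learn_from_correction(dictionary, heard, "Samara", "Xiomara")
second = transcribe("call", heard, dictionary)       # subsequent utterance
```

In this sketch the first call falls back to the only known entry, and the second call returns the corrected name because the learned pronunciation now matches the observed phonemes exactly.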