Methods and apparatus for speech recognition using visual information

Methods and apparatus for using visual information to facilitate a speech recognition process. The method comprises dividing received audio information into a plurality of audio frames, determining for each of the plurality of audio frames, whether the audio information in the audio frame comprises...

Full description

Saved in:

Bibliographic Details
Main Authors	Vopicka, Josef, Goel, Vaibhava, Marcheret, Etienne
Format	Patent
Language	English
Published	23.10.2018
Subjects	ACOUSTICS MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
Online Access	Get full text

Cover

Loading…

More Information
Summary:	Methods and apparatus for using visual information to facilitate a speech recognition process. The method comprises dividing received audio information into a plurality of audio frames, determining for each of the plurality of audio frames, whether the audio information in the audio frame comprises speech from the foreground speaker, wherein the determining is based, at least in part, on received visual information, and transmitting the audio frame to an automatic speech recognition (ASR) engine for speech recognition when it is determined that the audio frame comprises speech from the foreground speaker.
Bibliography:	Application Number: US201514696803