Methods and apparatus for speech recognition using visual information

Bibliographic Details
Main Authors: Vopicka, Josef; Goel, Vaibhava; Marcheret, Etienne
Format: Patent
Language: English
Published: 23.10.2018
Summary: Methods and apparatus for using visual information to facilitate a speech recognition process. The method comprises dividing received audio information into a plurality of audio frames; determining, for each of the plurality of audio frames, whether the audio information in the audio frame comprises speech from the foreground speaker, wherein the determining is based, at least in part, on received visual information; and transmitting the audio frame to an automatic speech recognition (ASR) engine for speech recognition when it is determined that the audio frame comprises speech from the foreground speaker.
Bibliography: Application Number: US201514696803
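
The three steps named in the summary (framing, a per-frame foreground-speaker decision informed by visual information, and conditional transmission to the ASR engine) can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the fixed frame length, the idea of one visual score per audio frame (for example, a lip-movement measure), and the threshold rule are all assumptions made for illustration; the record does not specify how the visual information is used.

```python
from typing import Callable, List, Sequence


def split_into_frames(samples: Sequence[float], frame_len: int) -> List[List[float]]:
    """Divide audio samples into consecutive frames of frame_len samples.

    The final frame may be shorter if the sample count is not a multiple
    of frame_len. A fixed frame length is an assumption of this sketch.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    return [list(samples[i:i + frame_len]) for i in range(0, len(samples), frame_len)]


def gate_frames(
    frames: Sequence[List[float]],
    visual_scores: Sequence[float],
    threshold: float,
    send_to_asr: Callable[[List[float]], None],
) -> int:
    """Forward only frames judged to contain foreground-speaker speech.

    visual_scores is a hypothetical per-frame value derived from video
    (assumed here: higher means the foreground speaker is more likely
    talking). The threshold test is a stand-in for whatever decision
    rule a real system would use. Returns the number of frames forwarded.
    """
    if len(frames) != len(visual_scores):
        raise ValueError("need exactly one visual score per audio frame")
    sent = 0
    for frame, score in zip(frames, visual_scores):
        if score >= threshold:
            send_to_asr(frame)  # stand-in for transmitting to an ASR engine
            sent += 1
    return sent


# Usage: ten samples, frames of four, and three made-up visual scores.
received: List[List[float]] = []
frames = split_into_frames(list(range(10)), frame_len=4)
count = gate_frames(frames, [0.9, 0.1, 0.7], threshold=0.5, send_to_asr=received.append)
print(count, received)  # prints: 2 [[0, 1, 2, 3], [8, 9]]
```

Passing the ASR engine in as a callback keeps the gating logic independent of any particular recognizer; in the example a list's `append` method plays that role so the forwarded frames can be inspected.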