Flickering Reduction with Partial Hypothesis Re-ranking for Streaming ASR

Bibliographic Details
Main Authors: Bruguier, Antoine Jean; Qiu, David; Strohman, Trevor; He, Yangzhang
Format: Patent
Language: English
Published: 25.01.2024
Summary: A method includes processing, using a speech recognizer, a first portion of audio data to generate a first lattice, and generating a first partial transcription for an utterance based on the first lattice. The method includes processing, using the recognizer, a second portion of the data to generate, based on the first lattice, a second lattice representing a plurality of partial speech recognition hypotheses for the utterance and a plurality of corresponding speech recognition scores. For each particular partial speech recognition hypothesis, the method includes generating a corresponding re-ranked score based on the corresponding speech recognition score and whether the particular partial speech recognition hypothesis shares a prefix with the first partial transcription. The method includes generating a second partial transcription for the utterance by selecting the partial speech recognition hypothesis of the plurality of partial speech recognition hypotheses having the highest corresponding re-ranked score.
Bibliography: Application Number: US202318352211
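
The re-ranking step described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The summary says only that the re-ranked score depends on the recognition score and on whether a hypothesis shares a prefix with the first partial transcription, so the sketch makes several assumptions: hypotheses arrive as (words, score) pairs rather than as a lattice, "shares a prefix" means the previous partial transcription is a word-level prefix of the hypothesis, and the re-ranked score is the recognition score plus a fixed bonus. The names `shares_prefix`, `rerank`, and `prefix_bonus` are hypothetical.

```python
from typing import List, Sequence, Tuple

Hypothesis = Tuple[List[str], float]  # (words, speech recognition score)


def shares_prefix(words: Sequence[str], previous: Sequence[str]) -> bool:
    """Assumed definition: the previous partial transcription is a
    word-level prefix of the candidate hypothesis."""
    return list(words[:len(previous)]) == list(previous)


def rerank(hypotheses: Sequence[Hypothesis],
           previous: Sequence[str],
           prefix_bonus: float = 0.5) -> Hypothesis:
    """Return the hypothesis with the highest re-ranked score.

    Assumed scoring rule: recognition score plus a fixed bonus when the
    hypothesis extends the previously displayed partial transcription.
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    best_words: List[str] = []
    best_score = float("-inf")
    for words, score in hypotheses:
        reranked = score + (prefix_bonus if shares_prefix(words, previous) else 0.0)
        if reranked > best_score:
            best_words, best_score = list(words), reranked
    return best_words, best_score


# Hypothetical data: the first partial transcription shown was "i want".
previous_partial = ["i", "want"]
second_lattice_hyps = [
    (["i", "want", "to"], -1.0),  # extends what was already displayed
    (["i", "won", "two"], -0.8),  # better raw score, but rewrites "want"
]
second_partial, _ = rerank(second_lattice_hyps, previous_partial)
```

Under these assumptions, the bonus lets the hypothesis that extends the displayed text win even though its raw score is slightly lower, which is how prefix-aware re-ranking would reduce flicker. With `prefix_bonus=0.0` the selection falls back to the raw recognition scores.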