Leveraging unpaired text data for training end-to-end spoken language understanding systems


Bibliographic Details
Main Authors: Thomas, Samuel; Picheny, Michael Alan; Huang, Yinghui; Audhkhasi, Kartik; Kuo, Hong-Kwang Jeff
Format: Patent
Language: English
Published: 21.02.2023
Summary: An illustrative embodiment includes a method for training an end-to-end (E2E) spoken language understanding (SLU) system. The method includes receiving a training corpus comprising a set of text classified using one or more sets of semantic labels but unpaired with speech and using the set of unpaired text to train the E2E SLU system to classify speech using at least one of the one or more sets of semantic labels. The method may include training a text-to-intent model using the set of unpaired text; and training a speech-to-intent model using the text-to-intent model. Alternatively or additionally, the method may include using a text-to-speech (TTS) system to generate synthetic speech from the unpaired text; and training the E2E SLU system using the synthetic speech.
Bibliography: Application Number: US202016841787
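
The TTS-based variant described in the summary can be pictured as a data-preparation step: each labeled but unpaired sentence is synthesized into audio, and the audio inherits the sentence's semantic label. The following is a minimal sketch of that idea only, not the patented implementation. The names `LabeledText`, `SpeechExample`, `toy_tts`, and `build_training_set` are hypothetical, the sentences and intent labels are illustrative, `toy_tts` is a stand-in that produces fake samples rather than real speech, and mixing in an optional set of real paired speech is an assumption the record does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Waveform = List[float]  # placeholder type for audio samples


@dataclass(frozen=True)
class LabeledText:
    """A sentence with a semantic label but no paired speech recording."""
    text: str
    intent: str


@dataclass(frozen=True)
class SpeechExample:
    """One (audio, label) pair usable for training a speech-to-intent model."""
    audio: Waveform
    intent: str
    synthetic: bool


def toy_tts(text: str) -> Waveform:
    """Hypothetical stand-in for a TTS system: one fake sample per character."""
    return [(ord(ch) % 32) / 32.0 for ch in text]


def synthesize_examples(
    unpaired: Sequence[LabeledText],
    tts: Callable[[str], Waveform],
) -> List[SpeechExample]:
    """Generate synthetic speech for each sentence and carry over its label."""
    return [
        SpeechExample(audio=tts(item.text), intent=item.intent, synthetic=True)
        for item in unpaired
    ]


def build_training_set(
    unpaired: Sequence[LabeledText],
    tts: Callable[[str], Waveform],
    real: Sequence[SpeechExample] = (),
) -> List[SpeechExample]:
    """Combine optional real speech (an assumption) with synthetic examples."""
    return list(real) + synthesize_examples(unpaired, tts)


# Illustrative data only; not taken from the patent.
unpaired_text = [
    LabeledText("book a flight to boston", "book_flight"),
    LabeledText("what is the weather today", "get_weather"),
]
real_speech = [SpeechExample(audio=[0.0, 0.1, -0.1], intent="book_flight", synthetic=False)]

training_set = build_training_set(unpaired_text, toy_tts, real=real_speech)
```

In practice `toy_tts` would be replaced by a call into an actual TTS system, and `training_set` would feed whatever trainer the E2E SLU model uses; the sketch only shows how labels attached to text can transfer to synthetic audio.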