Speech recognition accuracy with natural-language understanding based meta-speech systems for assistant systems

Bibliographic Details
Main Authors	Peng, Fuchun; Li, Jihang; Yu, Jinsong
Format	Patent
Language	English
Published	02.08.2022
Subjects	ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; ELECTRIC COMMUNICATION TECHNIQUE; ELECTRIC DIGITAL DATA PROCESSING; ELECTRICITY; MUSICAL INSTRUMENTS; PHYSICS; PICTORIAL COMMUNICATION, e.g. TELEVISION; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR; TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION

More Information
Summary:	In one embodiment, a method includes receiving, from a client system associated with a first user, a first audio input. The method includes generating multiple transcriptions corresponding to the first audio input based on multiple automatic speech recognition (ASR) engines. Each ASR engine is associated with a respective domain out of multiple domains. The method includes determining, for each transcription, a combination of one or more intents and one or more slots to be associated with the transcription. The method includes selecting, by a meta-speech engine, one or more combinations of intents and slots from the multiple combinations to be associated with the first user input. The method includes generating a response to the first audio input based on the selected combinations and sending, to the client system, instructions for presenting the response to the first audio input.
Bibliography:	Application Number: US202016741642
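
The flow described in the Summary (several domain-specific ASR engines, an intent/slot combination per transcription, then a meta-speech engine that selects among the combinations) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in rather than the patented implementation: the engine functions `music_asr` and `messaging_asr`, the rule-based `toy_nlu`, the `score` field, and the highest-score selection rule are all assumptions, since the abstract does not say how transcriptions are scored or how the meta-speech engine chooses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """One transcription plus the intents and slots attached to it."""
    domain: str
    transcription: str
    intents: Tuple[str, ...]
    slots: Tuple[Tuple[str, str], ...]
    score: float  # hypothetical confidence; not specified in the abstract


# Hypothetical domain-specific ASR engines. Real engines would decode the
# audio; these return fixed strings so the sketch stays runnable.
def music_asr(audio: bytes) -> str:
    return "play hello by adele"


def messaging_asr(audio: bytes) -> str:
    return "say hello to adele"


ASR_ENGINES: Dict[str, Callable[[bytes], str]] = {
    "music": music_asr,
    "messaging": messaging_asr,
}


def toy_nlu(domain: str, text: str) -> Candidate:
    """Assumed rule-based NLU: map a transcription to intents and slots."""
    words = text.split()
    if domain == "music" and words[:1] == ["play"] and "by" in words:
        i = words.index("by")
        slots = (("track", " ".join(words[1:i])),
                 ("artist", " ".join(words[i + 1:])))
        return Candidate(domain, text, ("play_music",), slots, 0.9)
    if domain == "messaging" and words[:1] == ["say"] and "to" in words:
        i = words.index("to")
        slots = (("body", " ".join(words[1:i])),
                 ("recipient", " ".join(words[i + 1:])))
        return Candidate(domain, text, ("send_message",), slots, 0.4)
    return Candidate(domain, text, (), (), 0.0)


def meta_speech_select(candidates: List[Candidate], top_k: int = 1) -> List[Candidate]:
    """Assumed selection rule: keep the top_k highest-scoring combinations."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:top_k]


def handle_audio(audio: bytes) -> str:
    """Run every engine, interpret each transcription, select, and respond."""
    candidates = [toy_nlu(domain, engine(audio))
                  for domain, engine in ASR_ENGINES.items()]
    best = meta_speech_select(candidates)[0]
    if not best.intents:
        return "Sorry, I did not understand that."
    return f"[{best.domain}] {best.intents[0]} {dict(best.slots)}"


if __name__ == "__main__":
    print(handle_audio(b"\x00\x01"))
```

Running every engine and deferring the choice until after intents and slots are known mirrors the ordering in the Summary: the selection operates on interpreted combinations, not on raw transcriptions.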