Two-pass end to end speech recognition
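Main Authors | SAINATH, Tara C; HE, Yanzhang; LIANG, Qiao; PANG, Ruoming; STROHMAN, Trevor; PRABHAVALKAR, Rohit; RYBACH, David; LI, Wei; VISONTAI, Mirkó; MCGRAW, Ian C; WU, Yonghui; CHIU, Chung-Cheng |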
---|---|
Format | Patent |
Language | English |
Published | 16.02.2023 |
Abstract | Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of the utterance. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) and generate the text representation of the utterance. For example, the second-pass portion can include a listen, attend and spell (LAS) decoder. Various implementations include an encoder shared between the RNN-T decoder and the LAS decoder. |
---|---|
Author | CHIU, Chung-Cheng; PRABHAVALKAR, Rohit; PANG, Ruoming; VISONTAI, Mirkó; RYBACH, David; LIANG, Qiao; WU, Yonghui; SAINATH, Tara C; HE, Yanzhang; LI, Wei; STROHMAN, Trevor; MCGRAW, Ian C |
ExternalDocumentID | AU2020288565BB2 |
Notes | Application Number: AU20200288565 |
OpenAccessLink | https://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230216&DB=EPODOC&CC=AU&NR=2020288565B2 |
RelatedCompanies | GOOGLE LLC |
SubjectTerms | ACOUSTICS; CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; MUSICAL INSTRUMENTS; PHYSICS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION |
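
The two-pass arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: every name below (`TwoPassRecognizer`, `Hypothesis`, the `toy_*` functions) is hypothetical, the three model components are trivial stand-ins rather than real RNN-T or LAS networks, and the second pass is modelled as rescoring of first-pass candidates, which is an assumption made only to keep the example small. What it shows is the data flow: each frame is encoded once, the first pass emits streaming candidates from the shared encoder output, and the second pass revises those candidates once the whole utterance has been seen.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence

Frame = Sequence[float]    # one chunk of acoustic features (hypothetical type)
Encoded = Sequence[float]  # shared-encoder output for one frame


@dataclass(frozen=True)
class Hypothesis:
    text: str
    score: float  # higher is better


class TwoPassRecognizer:
    """Toy wiring of a two-pass recognizer; all three callables are stand-ins."""

    def __init__(
        self,
        encoder: Callable[[Frame], Encoded],
        first_pass: Callable[[List[Encoded]], List[Hypothesis]],
        second_pass: Callable[[List[Encoded], Hypothesis], float],
    ) -> None:
        self.encoder = encoder
        self.first_pass = first_pass
        self.second_pass = second_pass

    def recognize(
        self,
        frames: Iterable[Frame],
        on_partial: Optional[Callable[[str], None]] = None,
    ) -> str:
        encoded: List[Encoded] = []
        candidates: List[Hypothesis] = []
        for frame in frames:
            # Encode each frame once; both passes read the same encoder output.
            encoded.append(self.encoder(frame))
            # First pass: streaming candidates from the audio seen so far.
            candidates = self.first_pass(encoded)
            if on_partial is not None and candidates:
                on_partial(max(candidates, key=lambda h: h.score).text)
        if not candidates:
            return ""
        # Second pass: revise the candidates using the full encoded utterance
        # (modelled here as rescoring, an assumption of this sketch).
        revised = [Hypothesis(h.text, self.second_pass(encoded, h)) for h in candidates]
        return max(revised, key=lambda h: h.score).text


# Toy stand-ins so the sketch runs end to end. They ignore the acoustics.
def toy_encoder(frame: Frame) -> Encoded:
    return [sum(frame) / len(frame)]


def toy_first_pass(encoded: List[Encoded]) -> List[Hypothesis]:
    n = len(encoded)
    plural = ["turn", "on", "the", "lights"]
    singular = ["turn", "on", "the", "light"]
    return [
        Hypothesis(" ".join(plural[:n]), 0.4),
        Hypothesis(" ".join(singular[:n]), 0.6),
    ]


def toy_second_pass(encoded: List[Encoded], hyp: Hypothesis) -> float:
    # Pretend that full-utterance context favours the plural form.
    return hyp.score + (0.5 if hyp.text.endswith("lights") else 0.0)
```

With these stand-ins, feeding four frames produces growing partial results from the first pass, and the final text can differ from the last partial because the second pass re-ranks the candidates. In a real system the stand-ins would be replaced by trained networks.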