Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition

This paper proposes an efficient memory transformer Emformer for low latency streaming speech recognition. In Emformer, the long-range history context is distilled into an augmented memory bank to reduce self-attention's computation complexity. A cache mechanism saves the computation for the ke...

Bibliographic Details
Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 6783 - 6787
Main Authors Shi, Yangyang; Wang, Yongqiang; Wu, Chunyang; Yeh, Ching-Feng; Chan, Julian; Zhang, Frank; Le, Duc; Seltzer, Mike
Format Conference Proceeding
Language English
Published IEEE 01.01.2021