Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition

This paper proposes Emformer, an efficient memory transformer for low-latency streaming speech recognition. In Emformer, the long-range history context is distilled into an augmented memory bank to reduce the computational complexity of self-attention. A cache mechanism saves the computation for the ke...

Bibliographic Details
Published in	Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 6783 - 6787
Main Authors	Shi, Yangyang; Wang, Yongqiang; Wu, Chunyang; Yeh, Ching-Feng; Chan, Julian; Zhang, Frank; Le, Duc; Seltzer, Mike
Format	Conference Proceeding
Language	English
Published	IEEE 01.01.2021
Subjects	Acoustics; Computational modeling; Emformer; Low Latency; Memory management; Signal processing; Speech recognition; Training; Transducers; Transformer
