Can Mamba Always Enjoy the "Free Lunch"?

Transformers have been the cornerstone of current Large Language Models (LLMs); however, their linear growth in inference overhead with respect to sequence length poses challenges for modeling long sequences. In this context, Mamba has gradually attracted attention due to its constant-level siz...
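
The trade-off the abstract points to can be seen in a minimal sketch. The toy Python comparison below is not code from the paper; the shapes, the update rules, and the decay constant A are illustrative assumptions. It contrasts a Transformer-style decoder, whose key-value cache grows with every token, with a Mamba-style recurrence that folds each token into a fixed-size hidden state.

```python
# Toy sketch (not from the paper): per-step inference state of a
# Transformer-style decoder vs. a Mamba-style recurrence. Names, shapes,
# and the decay constant A are illustrative assumptions.
import numpy as np

d_model, d_state = 64, 16

def transformer_step(kv_cache, x):
    # Decoding appends a (key, value) pair per token, so the cache -- and the
    # attention cost over it -- grows linearly with sequence length.
    kv_cache.append((x, x))  # stand-in for real key/value projections
    return kv_cache

def ssm_step(state, x, A=0.9):
    # A Mamba-like recurrence folds each token into a fixed-size state,
    # so per-step memory and compute stay constant in sequence length.
    return A * state + x[:d_state]

kv_cache, state = [], np.zeros(d_state)
for _ in range(1000):
    x = np.random.randn(d_model)
    kv_cache = transformer_step(kv_cache, x)
    state = ssm_step(state, x)

print(len(kv_cache))  # 1000: grows with the number of tokens processed
print(state.shape)    # (16,): unchanged regardless of sequence length
```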

Bibliographic Details
Main Authors: Ren, Ruifeng; Li, Zhicong; Liu, Yong
Format: Journal Article
Language: English
Published: 04.10.2024