Can Mamba Always Enjoy the "Free Lunch"?
Transformers have been the cornerstone of current Large Language Models (LLMs); however, the linear growth of their inference overhead with respect to sequence length poses challenges for modeling long sequences. In this context, Mamba has gradually attracted attention due to its constant-level siz...
Format: Journal Article
Language: English
Published: 04.10.2024