MoRe Fine-Tuning with 10x Fewer Parameters
Format | Journal Article
---|---
Language | English
Published | 30.08.2024
Summary: Parameter-efficient fine-tuning (PEFT) techniques have unlocked the potential to cheaply and easily specialize large pretrained models. However, the most prominent approaches, like low-rank adapters (LoRA), depend on heuristics or rules-of-thumb for their architectural choices -- potentially limiting their performance for new models and architectures. This limitation suggests that techniques from neural architecture search could be used to obtain optimal adapter architectures, but these are often expensive and difficult to implement. We address this challenge with Monarch Rectangular Fine-tuning (MoRe), a simple framework to search over adapter architectures that relies on the Monarch matrix class. Theoretically, we show that MoRe is more expressive than LoRA. Empirically, our approach is more parameter-efficient and performant than state-of-the-art PEFTs on a range of tasks and models, with as few as 5% of LoRA's parameters.
DOI: 10.48550/arxiv.2408.17383
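
As a rough illustration of the idea described in the abstract, the sketch below shows a hypothetical Monarch-structured adapter in PyTorch: instead of LoRA's dense low-rank product, the weight update is factored into two learned block-diagonal matrices with a fixed mixing permutation in between. The class name `MonarchAdapter`, the `nblocks` knob, and the initialization are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a Monarch-style adapter (illustrative only, not the paper's code).
import math
import torch
import torch.nn as nn


class MonarchAdapter(nn.Module):
    """Adds a Monarch-structured update on top of a frozen linear layer (sketch)."""

    def __init__(self, dim: int, nblocks: int = 4, scale: float = 1.0):
        super().__init__()
        assert dim % nblocks == 0, "dim must be divisible by nblocks"
        self.nblocks = nblocks
        self.block = dim // nblocks
        self.scale = scale
        # Two stacks of small dense blocks (block-diagonal factors). Together they
        # use 2 * dim^2 / nblocks parameters instead of the dim^2 of a dense update.
        self.left = nn.Parameter(torch.randn(nblocks, self.block, self.block) / math.sqrt(dim))
        # Zero init so the adapter starts as a no-op, analogous to LoRA's B = 0.
        self.right = nn.Parameter(torch.zeros(nblocks, self.block, self.block))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        *lead, dim = x.shape
        # First block-diagonal factor: a small dense matrix within each block.
        h = x.reshape(*lead, self.nblocks, self.block)
        h = torch.einsum("...nb,nbc->...nc", h, self.left)
        # Fixed stride permutation (transpose the block grid) so the second factor
        # mixes information across blocks, then apply the second block-diagonal factor.
        h = h.transpose(-2, -1).reshape(*lead, self.nblocks, self.block)
        h = torch.einsum("...nb,nbc->...nc", h, self.right)
        return self.scale * h.reshape(*lead, dim)


# Usage: keep the pretrained projection frozen and train only the adapter parameters.
frozen = nn.Linear(256, 256, bias=False)
frozen.weight.requires_grad_(False)
adapter = MonarchAdapter(dim=256, nblocks=4)
x = torch.randn(8, 256)
y = frozen(x) + adapter(x)  # base output plus the Monarch-structured update
```

Here `nblocks` controls the parameter budget (larger block counts give cheaper updates), which loosely corresponds to the kind of architectural choice the abstract says MoRe searches over.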