Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

Bibliographic Details
Published in: arXiv.org
Main Authors: Yang, Zhaorui; Liu, Qian; Pang, Tianyu; Wang, Han; Feng, Haozhe; Zhu, Minfeng; Chen, Wei
Format: Paper; Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 21.02.2024
Summary: The surge in Large Language Models (LLMs) has revolutionized natural language processing, but fine-tuning them for specific tasks often encounters challenges in balancing performance and preserving general instruction-following abilities. In this paper, we posit that the distribution gap between task datasets and the LLMs serves as the primary underlying cause. To address the problem, we introduce Self-Distillation Fine-Tuning (SDFT), a novel approach that bridges the distribution gap by guiding fine-tuning with a distilled dataset generated by the model itself to match its original distribution. Experimental results on the Llama-2-chat model across various benchmarks demonstrate that SDFT effectively mitigates catastrophic forgetting while achieving comparable or superior performance on downstream tasks compared to vanilla fine-tuning. Moreover, SDFT demonstrates the potential to maintain the helpfulness and safety alignment of LLMs. Our code is available at https://github.com/sail-sg/sdft.
ISSN: 2331-8422
DOI: 10.48550/arxiv.2402.13669
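
The summary describes SDFT only at a high level: regenerate each task response with the seed model itself, then fine-tune on the regenerated pairs so the training targets stay close to the model's own output distribution. Below is a minimal sketch of that data-distillation step, not the authors' implementation (see the repository above for that); the `generate` callable and the rewrite template are hypothetical placeholders standing in for whatever seed LM (e.g., Llama-2-chat) and prompt one actually uses.

```python
# Minimal sketch of the self-distillation step described in the summary.
# All names here are illustrative, not taken from https://github.com/sail-sg/sdft.
from typing import Callable, List, Tuple

# Placeholder rewrite prompt (hypothetical, not the paper's exact template).
DISTILL_TEMPLATE = (
    "Below is an instruction and a reference answer. Rewrite the reference "
    "answer in your own words, keeping it accurate and complete.\n\n"
    "Instruction: {instruction}\n"
    "Reference answer: {response}\n"
    "Rewritten answer:"
)

def build_distilled_dataset(
    task_data: List[Tuple[str, str]],  # (instruction, response) pairs from the task dataset
    generate: Callable[[str], str],    # seed LM text generator, e.g. a Llama-2-chat wrapper
) -> List[Tuple[str, str]]:
    """Regenerate every task response with the seed model itself, so that
    fine-tuning targets stay close to the model's original distribution."""
    distilled = []
    for instruction, response in task_data:
        prompt = DISTILL_TEMPLATE.format(instruction=instruction, response=response)
        distilled.append((instruction, generate(prompt).strip()))
    return distilled
```

Under this reading, standard supervised fine-tuning is then run on the distilled pairs instead of the raw task data, which is the substitution the summary credits with mitigating catastrophic forgetting.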