Condensed Sample-Guided Model Inversion for Knowledge Distillation

Bibliographic Details
Main Authors	Binici, Kuluhan; Aggarwal, Shivam; Acar, Cihan; Pham, Nam Trung; Leman, Karianto; Lee, Gim Hee; Mitra, Tulika
Format	Journal Article
Language	English
Published	25.08.2024
Subjects	Computer Science - Artificial Intelligence; Computer Science - Learning
Summary:	Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, generated through model inversion, to mimic the target data distribution. However, conventional model inversion methods are not designed to use supplementary information from the target dataset, so they cannot benefit from it even when it is available. In this paper, we consider condensed samples as a form of supplementary information and introduce a method for using them to better approximate the target data distribution, thereby enhancing KD performance. Our approach is versatile, as evidenced by improvements of up to 11.4% in KD accuracy across various datasets and model inversion-based methods. Importantly, it remains effective even when using as few as one condensed sample per class, and it can also enhance performance in few-shot scenarios where only limited real data samples are available.
DOI:	10.48550/arxiv.2408.13850
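
Illustrative note: the summary describes KD as transferring knowledge from a teacher model to a student model. The sketch below shows one common way this is expressed as a loss, assuming the classic temperature-softened formulation (KL divergence between teacher and student output distributions, scaled by the squared temperature). It is not taken from this paper and may differ from the objective the authors use; the names `softmax` and `kd_loss` and the default temperature of 4.0 are hypothetical choices for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: the classic soft-target distillation loss, not
    necessarily the loss used in the catalogued paper.
    """
    p = softmax(teacher_logits, temperature)  # softened teacher distribution
    q = softmax(student_logits, temperature)  # softened student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(kd_loss(teacher, student))
```

In a data-free setting such as the one the summary describes, the inputs fed to both models would be synthetic samples rather than real training data; the loss itself is unchanged by that choice in this sketch.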