Condensed Sample-Guided Model Inversion for Knowledge Distillation
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 25.08.2024 |
Subjects | |
Summary: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, generated through model inversion, to mimic the target data distribution. However, conventional model inversion methods are not designed to utilize supplementary information from the target dataset and thus cannot leverage it to improve performance, even when it is available. In this paper, we consider condensed samples as a form of supplementary information and introduce a method for using them to better approximate the target data distribution, thereby enhancing KD performance. Our approach is versatile, as evidenced by improvements of up to 11.4% in KD accuracy across various datasets and model inversion-based methods. Importantly, it remains effective even when using as few as one condensed sample per class, and it can also enhance performance in few-shot scenarios where only limited real data samples are available.
DOI: 10.48550/arxiv.2408.13850
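
The abstract names two moving parts: model inversion, which synthesizes training data from a frozen teacher, and standard KD on the synthesized batches, with condensed samples steering the inversion toward the target distribution. The PyTorch sketch below is a minimal illustration under stated assumptions, not the paper's actual method: the record does not spell out the objective, so the initialization from condensed samples and the logit-matching guidance term (`guide_weight`, `F.mse_loss`) are hypothetical stand-ins for the paper's condensed-sample guidance, and `teacher`/`student` are placeholder models.

```python
# Sketch only: the condensed-sample guidance term below is an assumed
# form for illustration, not the objective proposed in the paper.
import torch
import torch.nn.functional as F


def invert_with_condensed_guidance(teacher, cond_x, cond_y,
                                   steps=200, lr=0.1, guide_weight=1.0):
    """Synthesize a batch via model inversion, anchored to condensed samples.

    cond_x / cond_y hold one (or a few) condensed samples per class.
    """
    teacher.eval()
    # Initialize the synthetic batch as noisy copies of the condensed samples
    # (an assumption; plain Gaussian noise is the conventional starting point).
    x = (cond_x + 0.1 * torch.randn_like(cond_x)).detach().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    with torch.no_grad():
        ref_logits = teacher(cond_x)  # teacher's view of the condensed data
    for _ in range(steps):
        opt.zero_grad()
        logits = teacher(x)
        # Standard inversion term: make the teacher confident on target labels.
        loss = F.cross_entropy(logits, cond_y)
        # Hypothetical guidance: pull synthetic logits toward the teacher's
        # logits on the condensed samples.
        loss = loss + guide_weight * F.mse_loss(logits, ref_logits)
        loss.backward()
        opt.step()
    return x.detach()


def kd_step(student, teacher, x, optimizer, T=4.0):
    """One distillation step on a synthetic batch (Hinton-style KD loss)."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a data-free pipeline, `invert_with_condensed_guidance` would be called repeatedly to build a synthetic pool and `kd_step` would train the student on it; the abstract's claim that a single condensed sample per class suffices corresponds to `cond_x` holding one image per class.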