Condensed Sample-Guided Model Inversion for Knowledge Distillation

Bibliographic Details
Main Authors: Binici, Kuluhan; Aggarwal, Shivam; Acar, Cihan; Pham, Nam Trung; Leman, Karianto; Lee, Gim Hee; Mitra, Tulika
Format: Journal Article
Language: English
Published: 25.08.2024
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning
DOI: 10.48550/arxiv.2408.13850
Online Access: https://arxiv.org/abs/2408.13850
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0

Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, generated through model inversion, to mimic the target data distribution. However, conventional model inversion methods are not designed to utilize supplementary information from the target dataset and thus cannot leverage it to improve performance, even when it is available. In this paper, we consider condensed samples as a form of supplementary information and introduce a method for using them to better approximate the target data distribution, thereby enhancing KD performance. Our approach is versatile, evidenced by improvements of up to 11.4% in KD accuracy across various datasets and model inversion-based methods. Importantly, it remains effective even when using as few as one condensed sample per class, and it can also enhance performance in few-shot scenarios where only limited real data samples are available.
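For context, the sketch below illustrates the generic data-free KD pipeline the abstract refers to: synthetic inputs are produced by inverting a frozen teacher, and the student is then trained to match the teacher's softened predictions on them. It is a minimal, hypothetical PyTorch example, not the paper's condensed-sample-guided method; the `teacher`/`student` modules, input shape, and hyperparameters are all assumptions.

```python
import torch
import torch.nn.functional as F


def invert_batch(teacher, num_classes, batch_size=64, steps=200, lr=0.1, device="cpu"):
    """Model inversion: optimize random noise into inputs the frozen teacher
    classifies confidently, serving as stand-in training data."""
    teacher.eval()
    for p in teacher.parameters():
        p.requires_grad_(False)
    # Assumed input shape (3, 32, 32) and random target labels, for illustration only.
    x = torch.randn(batch_size, 3, 32, 32, device=device, requires_grad=True)
    y = torch.randint(0, num_classes, (batch_size,), device=device)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(teacher(x), y)   # push teacher toward the target labels
        loss = loss + 1e-4 * x.pow(2).mean()    # crude image prior; real methods use stronger ones
        loss.backward()
        opt.step()
    return x.detach(), y


def distill_step(student, teacher, x, optimizer, temperature=4.0):
    """One KD step on (synthetic) inputs x: match softened student logits
    to softened teacher logits via KL divergence."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    optimizer.zero_grad()
    kd_loss.backward()
    optimizer.step()
    return kd_loss.item()
```

In the paper's setting, the inversion objective is additionally guided by condensed samples; the abstract does not detail that mechanism, so it is not reproduced here.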