Condensed Sample-Guided Model Inversion for Knowledge Distillation

Bibliographic Details
Main Authors: Binici, Kuluhan; Aggarwal, Shivam; Acar, Cihan; Pham, Nam Trung; Leman, Karianto; Lee, Gim Hee; Mitra, Tulika
Format: Journal Article
Language: English
Published: 25.08.2024
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning
DOI: 10.48550/arxiv.2408.13850
Online Access: https://arxiv.org/abs/2408.13850
Copyright: http://arxiv.org/licenses/nonexclusive-distrib/1.0

Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, generated through model inversion, to mimic the target data distribution. However, conventional model inversion methods are not designed to utilize supplementary information from the target dataset and thus cannot leverage it to improve performance, even when it is available. In this paper, we consider condensed samples as a form of supplementary information and introduce a method for using them to better approximate the target data distribution, thereby enhancing KD performance. Our approach is versatile, evidenced by improvements of up to 11.4% in KD accuracy across various datasets and model inversion-based methods. Importantly, it remains effective even when using as few as one condensed sample per class, and it can also enhance performance in few-shot scenarios where only limited real data samples are available.
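For context, the sketch below illustrates the generic data-free KD pipeline the abstract refers to: synthetic inputs are produced by inverting a frozen teacher, and the student is then trained to match the teacher's softened predictions on them. It is a minimal, hypothetical PyTorch example, not the paper's condensed-sample-guided method; the `teacher`/`student` modules, input shape, and hyperparameters are all assumptions.

```python
import torch
import torch.nn.functional as F


def invert_batch(teacher, num_classes, batch_size=64, steps=200, lr=0.1, device="cpu"):
    """Model inversion: optimize random noise into inputs the frozen teacher
    classifies confidently, serving as stand-in training data."""
    teacher.eval()
    for p in teacher.parameters():
        p.requires_grad_(False)
    # Assumed input shape (3, 32, 32) and random target labels, for illustration only.
    x = torch.randn(batch_size, 3, 32, 32, device=device, requires_grad=True)
    y = torch.randint(0, num_classes, (batch_size,), device=device)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(teacher(x), y)   # push teacher toward the target labels
        loss = loss + 1e-4 * x.pow(2).mean()    # crude image prior; real methods use stronger ones
        loss.backward()
        opt.step()
    return x.detach(), y


def distill_step(student, teacher, x, optimizer, temperature=4.0):
    """One KD step on (synthetic) inputs x: match softened student logits
    to softened teacher logits via KL divergence."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    optimizer.zero_grad()
    kd_loss.backward()
    optimizer.step()
    return kd_loss.item()
```

In the paper's setting, the inversion objective is additionally guided by condensed samples; the abstract does not detail that mechanism, so it is not reproduced here.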