Reconstructing Training Data from Multiclass Neural Networks

Bibliographic Details
Main Authors	Buzaglo, Gon; Haim, Niv; Yehudai, Gilad; Vardi, Gal; Irani, Michal
Format	Journal Article
Language	English
Published	05.05.2023
Subjects	Computer Science - Computer Vision and Pattern Recognition; Computer Science - Cryptography and Security; Computer Science - Learning

Abstract	Reconstructing samples from the training set of trained neural networks is a major privacy concern. Haim et al. (2022) recently showed that it is possible to reconstruct training samples from neural network binary classifiers, based on theoretical results about the implicit bias of gradient methods. In this work, we present several improvements and new insights over this previous work. As our main improvement, we show that training-data reconstruction is possible in the multi-class setting and that the reconstruction quality is even higher than in the case of binary classification. Moreover, we show that using weight-decay during training increases the vulnerability to sample reconstruction. Finally, while in the previous work the training set was of size at most $1000$ from $10$ classes, we show preliminary evidence of the ability to reconstruct from a model trained on $5000$ samples from $100$ classes.
Copyright	http://creativecommons.org/licenses/by/4.0
DOI	10.48550/arxiv.2305.03350
OpenAccessLink	https://arxiv.org/abs/2305.03350
SecondaryResourceType	preprint