On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection
Format | Journal Article
Language | English
Published | 26.06.2023
Abstract | Detecting adversarial samples that are carefully crafted to fool a model is a critical step toward socially secure applications. However, existing adversarial detection methods require access to sufficient training data, which raises noteworthy concerns about privacy leakage and generalizability. In this work, we validate that adversarial samples generated by attack algorithms are strongly related to a specific vector in the high-dimensional input space. Such vectors, namely UAPs (Universal Adversarial Perturbations), can be calculated without the original training data. Based on this discovery, we propose a data-agnostic adversarial detection framework that induces different responses to UAPs between normal and adversarial samples. Experimental results show that our method achieves competitive detection performance on various text classification tasks while maintaining a time cost equivalent to normal inference.
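The response-difference idea described in the abstract can be sketched with a toy stand-in. Everything below — the tiny linear classifier `W`, the KL-based response score, and the `threshold` value — is a hypothetical illustration of thresholding a sample's prediction shift under a universal perturbation, not the paper's actual model or detector.

```python
import math
import random

# Hypothetical stand-in for a classifier: a tiny 2-class linear model.
# The weights, score, and threshold are illustrative placeholders only.
random.seed(0)
DIM, CLASSES = 8, 2
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(CLASSES)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x):
    # Class probabilities from the toy linear model.
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def kl(p, q):
    # KL divergence between two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def detect(x, uap, threshold=0.5):
    """Flag x as adversarial if its prediction shifts strongly when the
    universal perturbation is added -- the response-difference idea."""
    perturbed = [xi + ui for xi, ui in zip(x, uap)]
    return kl(predict(x), predict(perturbed)) > threshold

x = [random.gauss(0, 1) for _ in range(DIM)]
uap = [0.5 * random.gauss(0, 1) for _ in range(DIM)]
print(detect(x, uap))
```

Under this sketch, a sample whose prediction is stable under the UAP is treated as clean, while a large shift flags it as adversarial; note the paper computes UAPs without access to the original training data, which this toy omits entirely.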
Authors | Gao, Songyang; Dou, Shihan; Zhang, Qi; Huang, Xuanjing; Ma, Jin; Shan, Ying
Copyright | http://creativecommons.org/licenses/by-sa/4.0 |
DOI | 10.48550/arxiv.2306.15705 |
OpenAccessLink | https://arxiv.org/abs/2306.15705 |
Resource Type | Preprint
Source | arXiv (Open Access Repository)
Subjects | Computer Science - Computation and Language; Computer Science - Learning