Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise
Published in | IEEE Access, Vol. 13, pp. 36099–36111 |
---|---|
Main Authors | Fu, Hao; Krishnamurthy, Prashanth; Garg, Siddharth; Khorrami, Farshad |
Format | Journal Article |
Language | English |
Published | IEEE, 2025 |
Subjects | Closed box; Detectors; Glass box; Neural network backdoors; Neural networks; Noise; novelty detection; output resiliency; Perturbation methods; Resilience; Robustness; Strips; Training |
Online Access | Get full text |
Abstract | The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a white-box context, necessitating access to the backdoored model's architecture, hidden layer outputs, or internal parameters. The necessity for black-box A2O backdoor defenses arises, particularly in scenarios where only the network's input and output are accessible. However, prevalent black-box A2O backdoor defenses often mandate assumptions regarding the locations of triggers, as they leverage hand-crafted features for detection. In instances where triggers deviate from these assumptions, the resultant hand-crafted features diminish in quality, rendering these methods ineffective. To address this issue, this work proposes a post-training black-box A2O backdoor defense that maintains consistent efficacy regardless of the triggers' locations. Our method hinges on the empirical observation that, in the context of A2O backdoor attacks, poisoned samples are more resilient to uniform noise than clean samples in terms of the network output. Specifically, our approach uses a metric to quantify the resiliency of the given input to the uniform noise. A novelty detector, trained utilizing the quantified resiliency of available clean samples, is deployed to discern whether the given input is poisoned. The novelty detector is evaluated across various triggers. Our approach is effective on all utilized triggers. Lastly, an explanation is provided for our observation. |
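The detection idea in the abstract (poisoned inputs keep their label under uniform noise far more often than clean inputs) can be sketched as follows. This is a minimal toy, not the paper's implementation: the stand-in `model`, the trigger condition `x[0] > 0.5`, the noise magnitude, and the max-over-clean-scores threshold (used here in place of the paper's trained novelty detector) are all illustrative assumptions. Only input/output access to `model` is used, matching the black-box setting.

```python
import random

def model(x):
    """Toy black-box classifier (hypothetical stand-in for a backdoored DNN).
    A planted 'trigger' (x[0] > 0.5) forces the attack's target label 0;
    otherwise the label is a brittle function of the whole input."""
    if x[0] > 0.5:                      # trigger present -> all-to-one target label
        return 0
    return int(sum(x) * 10) % 5

def resiliency(x, n_trials=50, noise=0.3):
    """Fraction of uniform-noise perturbations that leave the output label
    unchanged -- a metric quantifying resiliency to uniform noise."""
    base = model(x)
    same = sum(
        model([xi + random.uniform(-noise, noise) for xi in x]) == base
        for _ in range(n_trials)
    )
    return same / n_trials

random.seed(0)

# Calibrate on available clean samples; taking the max clean score as a
# threshold is the simplest possible stand-in for a trained novelty detector.
clean = [[random.uniform(0.0, 0.2) for _ in range(4)] for _ in range(30)]
threshold = max(resiliency(x) for x in clean)

def is_poisoned(x):
    # Poisoned samples are empirically MORE resilient to noise than clean ones.
    return resiliency(x) > threshold
```

In this toy, a triggered input such as `[2.0, 0.1, 0.1, 0.1]` keeps label 0 under every perturbation (resiliency 1.0) and is flagged, while clean inputs churn labels under noise and score well below the threshold; the paper replaces the max-score threshold with a novelty detector fit on the clean-sample scores.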
Author | Fu, Hao; Krishnamurthy, Prashanth; Garg, Siddharth; Khorrami, Farshad |
Author_xml | – sequence: 1 givenname: Hao orcidid: 0000-0002-8282-6580 surname: Fu fullname: Fu, Hao email: hf881@nyu.edu organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA – sequence: 2 givenname: Prashanth orcidid: 0000-0001-8264-7972 surname: Krishnamurthy fullname: Krishnamurthy, Prashanth organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA – sequence: 3 givenname: Siddharth orcidid: 0000-0002-6158-9512 surname: Garg fullname: Garg, Siddharth organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA – sequence: 4 givenname: Farshad orcidid: 0000-0002-8418-004X surname: Khorrami fullname: Khorrami, Farshad organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA |
CODEN | IAECCG |
Cites_doi | 10.1109/TDSC.2023.3263507 10.23919/DATE48585.2020.9116489 10.1016/j.neunet.2012.02.016 10.1109/TBDATA.2019.2921572 10.1109/SP54263.2024.00015 10.1145/3474369.3486874 10.1109/ACCESS.2023.3245570 10.1109/CVPR.2014.244 10.1109/ICCV48922.2021.01617 10.1109/ICCV48922.2021.01615 10.1109/ACCESS.2022.3207839 10.1109/CVPR.2011.5995566 10.1145/3359789.3359790 10.1109/ICDM.2008.17 10.1109/TDSC.2020.3028448 10.1162/089976601750264965 10.1109/ACCESS.2020.3032411 10.1109/ACCESS.2019.2909068 10.1109/ACCESS.2022.3141077 10.1109/CCTA41146.2020.9206312 10.1145/3319535.3363216 10.1109/CVPR.2016.90 10.1145/3658644.3670361 10.1007/978-3-031-20065-6_11 10.1609/aaai.v36i9.21191 10.1145/342009.335388 10.1109/ICCV.2017.74 10.1109/SPW50608.2020.00025 10.1145/3427228.3427264 10.1109/CVPR.2009.5206848 10.1016/j.cose.2021.102277 10.29172/7c2a6982-6d72-4cd8-bba6-2fccb06a7011 10.1109/TIFS.2023.3297056 10.1007/978-3-030-00470-5_13 10.1109/SP.2019.00031 10.21314/JCF.2023.003 10.14722/ndss.2018.23291 10.1109/ICCV48922.2021.01175 10.1109/CVPR52688.2022.01465 10.1145/377939.377946 |
ContentType | Journal Article |
DBID | 97E ESBDL RIA RIE AAYXX CITATION DOA |
DOI | 10.1109/ACCESS.2025.3543333 |
DatabaseName | IEEE Xplore (IEEE) IEEE Xplore Open Access Journals IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef DOAJ Directory of Open Access Journals |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering |
EISSN | 2169-3536 |
EndPage | 36111 |
ExternalDocumentID | oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b 10_1109_ACCESS_2025_3543333 10891759 |
Genre | orig-research |
GrantInformation_xml | – fundername: Tamkeen under the NYUAD Research Institute Award grantid: CG010 – fundername: New York University Abu Dhabi (NYUAD) Center for Artificial Intelligence and Robotics (CAIR) – fundername: Army Research Office grantid: W911NF-21-1-0155 funderid: 10.13039/100000183 |
GroupedDBID | 0R~ 4.4 5VS 6IK 97E AAJGR ABAZT ABVLG ACGFS ADBBV AGSQL ALMA_UNASSIGNED_HOLDINGS BCNDV BEFXN BFFAM BGNUA BKEBE BPEOZ EBS EJD ESBDL GROUPED_DOAJ IPLJI JAVBF KQ8 M43 M~E O9- OCL OK1 RIA RIE RNS AAYXX CITATION RIG |
ID | FETCH-LOGICAL-c261t-5fa30599aab1e807ccde4c1d293b17b3e527245ab1ed24f01d241bc8a045511c3 |
IEDL.DBID | DOA |
ISSN | 2169-3536 |
IngestDate | Wed Aug 27 01:30:56 EDT 2025 Tue Jul 01 05:26:37 EDT 2025 Wed Aug 27 01:46:39 EDT 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://creativecommons.org/licenses/by/4.0/legalcode |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c261t-5fa30599aab1e807ccde4c1d293b17b3e527245ab1ed24f01d241bc8a045511c3 |
ORCID | 0000-0002-8282-6580 0000-0002-8418-004X 0000-0002-6158-9512 0000-0001-8264-7972 |
OpenAccessLink | https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b |
PageCount | 13 |
ParticipantIDs | doaj_primary_oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b ieee_primary_10891759 crossref_primary_10_1109_ACCESS_2025_3543333 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 20250000 2025-00-00 2025-01-01 |
PublicationDateYYYYMMDD | 2025-01-01 |
PublicationDate_xml | – year: 2025 text: 20250000 |
PublicationDecade | 2020 |
PublicationTitle | IEEE access |
PublicationTitleAbbrev | Access |
PublicationYear | 2025 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
References | ref13 Lin (ref53) ref57 ref12 ref56 LeCun (ref43) 2010 ref14 ref58 Qiao (ref25); 32 ref52 ref11 ref10 ref54 ref17 Zhang (ref31) ref16 ref19 ref18 Nguyen (ref23) ref51 ref46 ref48 ref47 ref42 ref41 ref44 Guo (ref15) ref49 ref8 ref7 Zhao (ref2) 2024 ref9 ref6 ref5 ref40 Nguyen (ref20) ref35 ref34 ref33 ref32 Patel (ref1) ref39 ref38 Krizhevsky (ref45) 2009 Chen (ref50) 2017 Xie (ref30) Kim (ref3) 2024 Cohen (ref37) ref24 Hong (ref21) ref26 ref22 Zhao (ref4) 2024 ref28 ref27 ref29 Zhang (ref36) 2022 Platt (ref55) 1999; 10 |
References_xml | – ident: ref8 doi: 10.1109/TDSC.2023.3263507 – volume: 10 start-page: 61 issue: 3 year: 1999 ident: ref55 article-title: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods publication-title: Adv. Large Margin Classifiers – ident: ref49 doi: 10.23919/DATE48585.2020.9116489 – ident: ref44 doi: 10.1016/j.neunet.2012.02.016 – ident: ref39 doi: 10.1109/TBDATA.2019.2921572 – year: 2024 ident: ref4 article-title: PolyModel for Hedge Funds’ portfolio construction using machine learning publication-title: arXiv:2412.11019 – ident: ref57 doi: 10.1109/SP54263.2024.00015 – ident: ref51 doi: 10.1145/3474369.3486874 – start-page: 1 volume-title: Proc. Int. Conf. Learn. Represent. ident: ref53 article-title: Network in network – start-page: 8068 volume-title: Proc. Adv. Neural Inf. Process. Syst. ident: ref21 article-title: Handcrafted backdoors in deep neural networks – ident: ref6 doi: 10.1109/ACCESS.2023.3245570 – start-page: 1310 volume-title: Proc. Int. Conf. Mach. Learn. ident: ref37 article-title: Certified adversarial robustness via randomized smoothing – ident: ref54 doi: 10.1109/CVPR.2014.244 – start-page: 11372 volume-title: Proc. Int. Conf. Mach. Learn. 
ident: ref30 article-title: CRFL: Certifiably robust federated learning against backdoor attacks – ident: ref16 doi: 10.1109/ICCV48922.2021.01617 – year: 2024 ident: ref3 article-title: Face-GPS: A comprehensive technique for quantifying facial muscle dynamics in videos publication-title: arXiv:2401.05625 – ident: ref19 doi: 10.1109/ICCV48922.2021.01615 – year: 2017 ident: ref50 article-title: Targeted backdoor attacks on deep learning systems using data poisoning publication-title: arXiv:1712.05526 – ident: ref35 doi: 10.1109/ACCESS.2022.3207839 – ident: ref46 doi: 10.1109/CVPR.2011.5995566 – ident: ref17 doi: 10.1145/3359789.3359790 – ident: ref41 doi: 10.1109/ICDM.2008.17 – year: 2009 ident: ref45 article-title: Learning multiple layers of features from tiny images – ident: ref22 doi: 10.1109/TDSC.2020.3028448 – start-page: 1 volume-title: Proc. Int. Conf. Learn. Represent. ident: ref23 article-title: WaNet–imperceptible warping-based backdoor attack – ident: ref40 doi: 10.1162/089976601750264965 – ident: ref48 doi: 10.1109/ACCESS.2020.3032411 – ident: ref9 doi: 10.1109/ACCESS.2019.2909068 – ident: ref27 doi: 10.1109/ACCESS.2022.3141077 – ident: ref5 doi: 10.1109/CCTA41146.2020.9206312 – ident: ref13 doi: 10.1145/3319535.3363216 – start-page: 31474 volume-title: Proc. Adv. Neural Inf. Process. Syst. ident: ref31 article-title: BagFlip: A certified defense against data poisoning – year: 2024 ident: ref2 article-title: Hedge fund portfolio construction using PolyModel theory and iTransformer publication-title: arXiv:2408.03320 – ident: ref52 doi: 10.1109/CVPR.2016.90 – ident: ref58 doi: 10.1145/3658644.3670361 – ident: ref26 doi: 10.1007/978-3-031-20065-6_11 – ident: ref29 doi: 10.1609/aaai.v36i9.21191 – ident: ref42 doi: 10.1145/342009.335388 – ident: ref34 doi: 10.1109/ICCV.2017.74 – year: 2010 ident: ref43 publication-title: MNIST Handwritten Digit Database – ident: ref33 doi: 10.1109/SPW50608.2020.00025 – start-page: 3454 volume-title: Proc. Adv. Neural Inf. 
Process. Syst. ident: ref20 article-title: Input-aware dynamic backdoor attack – volume: 32 start-page: 14004 volume-title: Proc. Adv. Neural Inf. Process. Syst. ident: ref25 article-title: Defending neural backdoors via generative distribution modeling – ident: ref32 doi: 10.1145/3427228.3427264 – start-page: 1 volume-title: Proc. Int. Conf. Learn. Represent. ident: ref15 article-title: AEVA: Black-box backdoor detection using adversarial extreme value analysis – ident: ref47 doi: 10.1109/CVPR.2009.5206848 – ident: ref14 doi: 10.1016/j.cose.2021.102277 – ident: ref56 doi: 10.29172/7c2a6982-6d72-4cd8-bba6-2fccb06a7011 – ident: ref18 doi: 10.1109/TIFS.2023.3297056 – ident: ref12 doi: 10.1007/978-3-030-00470-5_13 – ident: ref24 doi: 10.1109/SP.2019.00031 – ident: ref7 doi: 10.21314/JCF.2023.003 – ident: ref10 doi: 10.14722/ndss.2018.23291 – ident: ref11 doi: 10.1109/ICCV48922.2021.01175 – ident: ref28 doi: 10.1109/CVPR52688.2022.01465 – year: 2022 ident: ref36 article-title: Adversarial samples for deep monocular 6D object pose estimation publication-title: arXiv:2203.00302 – start-page: 1 volume-title: Proc. NeurIPS Workshop Dataset Curation Secur. ident: ref1 article-title: Bait and switch: Online training data poisoning of autonomous driving systems – ident: ref38 doi: 10.1145/377939.377946 |
SSID | ssj0000816957 |
Score | 2.3344336 |
Snippet | The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a... |
SourceID | doaj crossref ieee |
SourceType | Open Website Index Database Publisher |
StartPage | 36099 |
SubjectTerms | Closed box Detectors Glass box Neural network backdoors Neural networks Noise novelty detection output resiliency Perturbation methods Resilience Robustness Strips Training |
SummonAdditionalLinks | – databaseName: IEEE Electronic Library (IEL) dbid: RIE link: http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1La9wwEBbZnJJD0jQJ2SQNOvQYb21LWtvHfTSEHlwoXcihYPQYwbKLFbLaUvrrO5KdkC0EchOSwZJGj29mNN8Q8rmyNtfClAlefyzhWS5xS5XBUyhVXhVK2I7tsx7fL_i3B_HQB6vHWBgAiI_PYBSK0ZdvnN4GUxnu8BK1C1ENyAA1ty5Y68WgEjJIVKLomYWytPoymc1wEKgD5mLEBGeMsZ3bJ5L072RViZfK3TGpn7vTvSVZjbZejfTf_5ga393fD-Soh5d00q2HE7IH7Udy-Ip08JT8mkNwHGCZTtbrxLvkewt0KvXKOPdEJ96HsHu6bGk07iVT94fO63pDfy8lnff5VPBcWNMfTm03PpyV1Dtau-UGzsji7uvP2X3S51hINOpOPhFWskDRIqXKoEwLrQ1wnRlEASorFAORFzkXodXk3KbYwjOlS4lQELGaZudkv3UtXBAKWKnTTKdVqbgAUFJZpcdGhTAhxAFDcvs8981jR6XRRBUkrZpOVE0QVdOLakimQT4vnwYe7FiBU9z026ox1liBoEhbxCkgSoWIysK4zHlR4QDVkJwFsbz6XyeRyzfqr8hB6ENnY7km-_5pC58QdXh1E1fbP0ZS06E priority: 102 providerName: IEEE |
Title | Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise |
URI | https://ieeexplore.ieee.org/document/10891759 https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b |
Volume | 13 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwrV07T8MwELZQJxgQjyLKo_LASGj8SpwxbakqhiIhKnVAimzHliqqBLUp4udzTgJqJxa2yI4S-y72d3fOfYfQXeIcNSKXAcAfCzihCpaU9CeFStMk1sI1bJ-zaDrnTwux2Cn15f8Ja-iBG8ENcpc7AbBnHCCRFVIDZjobScrjxNBI-90XMG_Hmar3YEmiRMQtzRAJk0E6GsGMwCGk4oEJzhhje1BUM_bvlVipEWZygo5b0xCnzZBO0YEtztDRDmHgOXobWx_0h2ucrlZBVQbPhcVDZd7zslzjtKp8yjxeFrgOzAXD8guPZ7MN_lwqPG5rocCaXuGXUm83ld_ncFXiWbnc2C6aTx5fR9OgrY8QwMRJFQinmKdXUUoTK8PYmNxyQ3JAcE1izaygMeXC9-aUuxB6ONFGKjDjwM4y7AJ1irKwlwhbaDQhMWEiNRfWaqWdNlGufYoPYHgP3f-IKvtoaDCy2n0Ik6yRbOYlm7WS7aGhF-fvrZ7Dum4AzWatZrO_NNtDXa-MnfdJ8C1FcvUfD79Gh37ATTDlBnWq9dbegnlR6X79JfXrTMBvR_nKrA |
linkProvider | Directory of Open Access Journals |
linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1baxQxFA5aH9QHry2utZoHH511Mkl2Zh730rJqHUFa6IMw5HICS5eJdLNS-ut7kpmWVhB8C8nAJPmSnEtyvkPIx9q5wkhbZSj-eCZYoXBLVfGmUOmiLrV0PdtnM1meiq9n8mwIVk-xMACQHp_BOBbTXb71ZhtdZbjDK7QuZP2QPELBL1kfrnXrUok5JGpZDtxCLK8_T-dzHAZagYUccyk45_ye_Ek0_ffyqiSxcvScNDcd6l-TnI-3QY_N1V9cjf_d4xfk2aBg0mm_Il6SB9C9Ik_v0A6-Jr8WEK8OsEyn63UWfPajAzpT5tx6f0GnIcTAe7rqaHLvZTN_SRdNs6F_VoouhowqeDKs6U-vt5sQT0saPG38agO75PTo8GS-zIYsC5lB6ylk0ikeSVqU0gyqvDTGgjDMoh6gWak5yKIshIytthAuxxbBtKlUHiFghu-Rnc538IZQwEqTM5PXlRYSQCvttJlYHQOFUBMYkU83c9_-7sk02mSE5HXbQ9VGqNoBqhGZRXxuP41M2KkCp7gdNlZrnXUS1SLjUFMBWWnUqRxMqkKUNQ5Qj8huhOXO_3pE3v6j_gN5vDz5ftwef2m-7ZMnsT-9x-Ud2QkXWzhAHSTo92nlXQN6Q9bq |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Detecting+All-to-One+Backdoor+Attacks+in+Black-Box+DNNs+via+Differential+Robustness+to+Noise&rft.jtitle=IEEE+access&rft.au=Fu%2C+Hao&rft.au=Krishnamurthy%2C+Prashanth&rft.au=Garg%2C+Siddharth&rft.au=Khorrami%2C+Farshad&rft.date=2025&rft.issn=2169-3536&rft.eissn=2169-3536&rft.volume=13&rft.spage=36099&rft.epage=36111&rft_id=info:doi/10.1109%2FACCESS.2025.3543333&rft.externalDBID=n%2Fa&rft.externalDocID=10_1109_ACCESS_2025_3543333 |
thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=2169-3536&client=summon |
thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=2169-3536&client=summon |
thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=2169-3536&client=summon |