Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise

Bibliographic Details
Published in: IEEE Access, Vol. 13, pp. 36099–36111
Main Authors: Fu, Hao; Krishnamurthy, Prashanth; Garg, Siddharth; Khorrami, Farshad
Format: Journal Article
Language: English
Published: IEEE, 2025
Subjects
Online Access: Get full text

Abstract The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a white-box context, necessitating access to the backdoored model's architecture, hidden layer outputs, or internal parameters. The necessity for black-box A2O backdoor defenses arises, particularly in scenarios where only the network's input and output are accessible. However, prevalent black-box A2O backdoor defenses often mandate assumptions regarding the locations of triggers, as they leverage hand-crafted features for detection. In instances where triggers deviate from these assumptions, the resultant hand-crafted features diminish in quality, rendering these methods ineffective. To address this issue, this work proposes a post-training black-box A2O backdoor defense that maintains consistent efficacy regardless of the triggers' locations. Our method hinges on the empirical observation that, in the context of A2O backdoor attacks, poisoned samples are more resilient to uniform noise than clean samples in terms of the network output. Specifically, our approach uses a metric to quantify the resiliency of the given input to the uniform noise. A novelty detector, trained utilizing the quantified resiliency of available clean samples, is deployed to discern whether the given input is poisoned. The novelty detector is evaluated across various triggers. Our approach is effective on all utilized triggers. Lastly, an explanation is provided for our observation.
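The detection pipeline described in the abstract — score each input's resiliency to uniform noise, then use a novelty detector fit on clean samples' scores — can be sketched as below. This is a minimal illustration under stated assumptions: the resiliency metric here (fraction of noised copies keeping the clean top-1 label), the `OneClassSVM` detector, the toy `model_fn`, and all parameter values are hypothetical stand-ins, not the paper's exact choices.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

def resiliency(model_fn, x, n_trials=32, eps=0.1):
    """Hypothetical resiliency metric: fraction of uniformly-noised
    copies of `x` whose top-1 prediction matches the clean prediction.
    Uses only the network's input and output (black-box access)."""
    base = int(np.argmax(model_fn(x)))
    hits = sum(
        int(np.argmax(model_fn(np.clip(x + rng.uniform(-eps, eps, x.shape),
                                       0.0, 1.0))) == base)
        for _ in range(n_trials)
    )
    return hits / n_trials

def model_fn(x):
    """Toy stand-in for the (possibly backdoored) black-box classifier."""
    logits = np.array([8.0 * (x[0] - x[1]), -8.0 * (x[0] - x[1])])
    e = np.exp(logits - logits.max())
    return e / e.sum()

# 1) Quantify resiliency of the available clean samples.
clean_scores = np.array([[resiliency(model_fn, rng.uniform(0, 1, 16))]
                         for _ in range(50)])
# 2) Train a one-class novelty detector on those scores.
detector = OneClassSVM(nu=0.1, gamma="scale").fit(clean_scores)
# 3) Flag an incoming input as poisoned if its score is novel (-1).
test_score = np.array([[resiliency(model_fn, rng.uniform(0, 1, 16))]])
is_poisoned = detector.predict(test_score)[0] == -1
```

In this sketch a poisoned input would (per the paper's observation) keep its target label under noise more often than clean inputs do, so its resiliency score would fall outside the clean-score distribution and be flagged as novel.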
Author Khorrami, Farshad
Fu, Hao
Krishnamurthy, Prashanth
Garg, Siddharth
Author_xml – sequence: 1
  givenname: Hao
  orcidid: 0000-0002-8282-6580
  surname: Fu
  fullname: Fu, Hao
  email: hf881@nyu.edu
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 2
  givenname: Prashanth
  orcidid: 0000-0001-8264-7972
  surname: Krishnamurthy
  fullname: Krishnamurthy, Prashanth
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 3
  givenname: Siddharth
  orcidid: 0000-0002-6158-9512
  surname: Garg
  fullname: Garg, Siddharth
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 4
  givenname: Farshad
  orcidid: 0000-0002-8418-004X
  surname: Khorrami
  fullname: Khorrami, Farshad
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
CODEN IAECCG
ContentType Journal Article
DBID 97E
ESBDL
RIA
RIE
AAYXX
CITATION
DOA
DOI 10.1109/ACCESS.2025.3543333
DatabaseName IEEE Xplore (IEEE)
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
DatabaseTitleList

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2169-3536
EndPage 36111
ExternalDocumentID oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b
10_1109_ACCESS_2025_3543333
10891759
Genre orig-research
GrantInformation_xml – fundername: Tamkeen under the NYUAD Research Institute Award
  grantid: CG010
– fundername: New York University Abu Dhabi (NYUAD) Center for Artificial Intelligence and Robotics (CAIR)
– fundername: Army Research Office
  grantid: W911NF-21-1-0155
  funderid: 10.13039/100000183
GroupedDBID 0R~
4.4
5VS
6IK
97E
AAJGR
ABAZT
ABVLG
ACGFS
ADBBV
AGSQL
ALMA_UNASSIGNED_HOLDINGS
BCNDV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
ESBDL
GROUPED_DOAJ
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
OK1
RIA
RIE
RNS
AAYXX
CITATION
RIG
IEDL.DBID DOA
ISSN 2169-3536
IngestDate Wed Aug 27 01:30:56 EDT 2025
Tue Jul 01 05:26:37 EDT 2025
Wed Aug 27 01:46:39 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
LinkModel DirectLink
ORCID 0000-0002-8282-6580
0000-0002-8418-004X
0000-0002-6158-9512
0000-0001-8264-7972
OpenAccessLink https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b
PageCount 13
ParticipantIDs doaj_primary_oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b
ieee_primary_10891759
crossref_primary_10_1109_ACCESS_2025_3543333
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 20250000
2025-00-00
2025-01-01
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – year: 2025
  text: 20250000
PublicationDecade 2020
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
References ref13
Lin (ref53)
ref57
ref12
ref56
LeCun (ref43) 2010
ref14
ref58
Qiao (ref25); 32
ref52
ref11
ref10
ref54
ref17
Zhang (ref31)
ref16
ref19
ref18
Nguyen (ref23)
ref51
ref46
ref48
ref47
ref42
ref41
ref44
Guo (ref15)
ref49
ref8
ref7
Zhao (ref2) 2024
ref9
ref6
ref5
ref40
Nguyen (ref20)
ref35
ref34
ref33
ref32
Patel (ref1)
ref39
ref38
Krizhevsky (ref45) 2009
Chen (ref50) 2017
Xie (ref30)
Kim (ref3) 2024
Cohen (ref37)
ref24
Hong (ref21)
ref26
ref22
Zhao (ref4) 2024
ref28
ref27
ref29
Zhang (ref36) 2022
Platt (ref55) 1999; 10
References_xml – ident: ref8
  doi: 10.1109/TDSC.2023.3263507
– volume: 10
  start-page: 61
  issue: 3
  year: 1999
  ident: ref55
  article-title: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods
  publication-title: Adv. Large Margin Classifiers
– ident: ref49
  doi: 10.23919/DATE48585.2020.9116489
– ident: ref44
  doi: 10.1016/j.neunet.2012.02.016
– ident: ref39
  doi: 10.1109/TBDATA.2019.2921572
– year: 2024
  ident: ref4
  article-title: PolyModel for Hedge Funds’ portfolio construction using machine learning
  publication-title: arXiv:2412.11019
– ident: ref57
  doi: 10.1109/SP54263.2024.00015
– ident: ref51
  doi: 10.1145/3474369.3486874
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref53
  article-title: Network in network
– start-page: 8068
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref21
  article-title: Handcrafted backdoors in deep neural networks
– ident: ref6
  doi: 10.1109/ACCESS.2023.3245570
– start-page: 1310
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref37
  article-title: Certified adversarial robustness via randomized smoothing
– ident: ref54
  doi: 10.1109/CVPR.2014.244
– start-page: 11372
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref30
  article-title: CRFL: Certifiably robust federated learning against backdoor attacks
– ident: ref16
  doi: 10.1109/ICCV48922.2021.01617
– year: 2024
  ident: ref3
  article-title: Face-GPS: A comprehensive technique for quantifying facial muscle dynamics in videos
  publication-title: arXiv:2401.05625
– ident: ref19
  doi: 10.1109/ICCV48922.2021.01615
– year: 2017
  ident: ref50
  article-title: Targeted backdoor attacks on deep learning systems using data poisoning
  publication-title: arXiv:1712.05526
– ident: ref35
  doi: 10.1109/ACCESS.2022.3207839
– ident: ref46
  doi: 10.1109/CVPR.2011.5995566
– ident: ref17
  doi: 10.1145/3359789.3359790
– ident: ref41
  doi: 10.1109/ICDM.2008.17
– year: 2009
  ident: ref45
  article-title: Learning multiple layers of features from tiny images
– ident: ref22
  doi: 10.1109/TDSC.2020.3028448
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref23
  article-title: WaNet–imperceptible warping-based backdoor attack
– ident: ref40
  doi: 10.1162/089976601750264965
– ident: ref48
  doi: 10.1109/ACCESS.2020.3032411
– ident: ref9
  doi: 10.1109/ACCESS.2019.2909068
– ident: ref27
  doi: 10.1109/ACCESS.2022.3141077
– ident: ref5
  doi: 10.1109/CCTA41146.2020.9206312
– ident: ref13
  doi: 10.1145/3319535.3363216
– start-page: 31474
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref31
  article-title: BagFlip: A certified defense against data poisoning
– year: 2024
  ident: ref2
  article-title: Hedge fund portfolio construction using PolyModel theory and iTransformer
  publication-title: arXiv:2408.03320
– ident: ref52
  doi: 10.1109/CVPR.2016.90
– ident: ref58
  doi: 10.1145/3658644.3670361
– ident: ref26
  doi: 10.1007/978-3-031-20065-6_11
– ident: ref29
  doi: 10.1609/aaai.v36i9.21191
– ident: ref42
  doi: 10.1145/342009.335388
– ident: ref34
  doi: 10.1109/ICCV.2017.74
– year: 2010
  ident: ref43
  publication-title: MNIST Handwritten Digit Database
– ident: ref33
  doi: 10.1109/SPW50608.2020.00025
– start-page: 3454
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref20
  article-title: Input-aware dynamic backdoor attack
– volume: 32
  start-page: 14004
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref25
  article-title: Defending neural backdoors via generative distribution modeling
– ident: ref32
  doi: 10.1145/3427228.3427264
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref15
  article-title: AEVA: Black-box backdoor detection using adversarial extreme value analysis
– ident: ref47
  doi: 10.1109/CVPR.2009.5206848
– ident: ref14
  doi: 10.1016/j.cose.2021.102277
– ident: ref56
  doi: 10.29172/7c2a6982-6d72-4cd8-bba6-2fccb06a7011
– ident: ref18
  doi: 10.1109/TIFS.2023.3297056
– ident: ref12
  doi: 10.1007/978-3-030-00470-5_13
– ident: ref24
  doi: 10.1109/SP.2019.00031
– ident: ref7
  doi: 10.21314/JCF.2023.003
– ident: ref10
  doi: 10.14722/ndss.2018.23291
– ident: ref11
  doi: 10.1109/ICCV48922.2021.01175
– ident: ref28
  doi: 10.1109/CVPR52688.2022.01465
– year: 2022
  ident: ref36
  article-title: Adversarial samples for deep monocular 6D object pose estimation
  publication-title: arXiv:2203.00302
– start-page: 1
  volume-title: Proc. NeurIPS Workshop Dataset Curation Secur.
  ident: ref1
  article-title: Bait and switch: Online training data poisoning of autonomous driving systems
– ident: ref38
  doi: 10.1145/377939.377946
SSID ssj0000816957
Score 2.3344336
Snippet The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a...
SourceID doaj
crossref
ieee
SourceType Open Website
Index Database
Publisher
StartPage 36099
SubjectTerms Closed box
Detectors
Glass box
Neural network backdoors
Neural networks
Noise
novelty detection
output resiliency
Perturbation methods
Resilience
Robustness
Strips
Training
Title Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise
URI https://ieeexplore.ieee.org/document/10891759
https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b
Volume 13
hasFullText 1
inHoldings 1
isFullTextHit
isPrint