Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise

Bibliographic Details
Published in: IEEE Access, Vol. 13, pp. 36099–36111
Main Authors: Fu, Hao; Krishnamurthy, Prashanth; Garg, Siddharth; Khorrami, Farshad
Format: Journal Article
Language: English
Published: IEEE, 2025
Subjects
Online Access: Get full text

Abstract The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a white-box context, necessitating access to the backdoored model's architecture, hidden layer outputs, or internal parameters. The necessity for black-box A2O backdoor defenses arises, particularly in scenarios where only the network's input and output are accessible. However, prevalent black-box A2O backdoor defenses often mandate assumptions regarding the locations of triggers, as they leverage hand-crafted features for detection. In instances where triggers deviate from these assumptions, the resultant hand-crafted features diminish in quality, rendering these methods ineffective. To address this issue, this work proposes a post-training black-box A2O backdoor defense that maintains consistent efficacy regardless of the triggers' locations. Our method hinges on the empirical observation that, in the context of A2O backdoor attacks, poisoned samples are more resilient to uniform noise than clean samples in terms of the network output. Specifically, our approach uses a metric to quantify the resiliency of the given input to the uniform noise. A novelty detector, trained utilizing the quantified resiliency of available clean samples, is deployed to discern whether the given input is poisoned. The novelty detector is evaluated across various triggers. Our approach is effective on all utilized triggers. Lastly, an explanation is provided for our observation.
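The detection pipeline described in the abstract — score each input's resiliency to uniform noise, then use a novelty detector fit on clean samples' scores — can be sketched as below. This is a minimal illustration under stated assumptions: the resiliency metric here (fraction of noised copies keeping the clean top-1 label), the `OneClassSVM` detector, the toy `model_fn`, and all parameter values are hypothetical stand-ins, not the paper's exact choices.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

def resiliency(model_fn, x, n_trials=32, eps=0.1):
    """Hypothetical resiliency metric: fraction of uniformly-noised
    copies of `x` whose top-1 prediction matches the clean prediction.
    Uses only the network's input and output (black-box access)."""
    base = int(np.argmax(model_fn(x)))
    hits = sum(
        int(np.argmax(model_fn(np.clip(x + rng.uniform(-eps, eps, x.shape),
                                       0.0, 1.0))) == base)
        for _ in range(n_trials)
    )
    return hits / n_trials

def model_fn(x):
    """Toy stand-in for the (possibly backdoored) black-box classifier."""
    logits = np.array([8.0 * (x[0] - x[1]), -8.0 * (x[0] - x[1])])
    e = np.exp(logits - logits.max())
    return e / e.sum()

# 1) Quantify resiliency of the available clean samples.
clean_scores = np.array([[resiliency(model_fn, rng.uniform(0, 1, 16))]
                         for _ in range(50)])
# 2) Train a one-class novelty detector on those scores.
detector = OneClassSVM(nu=0.1, gamma="scale").fit(clean_scores)
# 3) Flag an incoming input as poisoned if its score is novel (-1).
test_score = np.array([[resiliency(model_fn, rng.uniform(0, 1, 16))]])
is_poisoned = detector.predict(test_score)[0] == -1
```

In this sketch a poisoned input would (per the paper's observation) keep its target label under noise more often than clean inputs do, so its resiliency score would fall outside the clean-score distribution and be flagged as novel.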
Author Khorrami, Farshad
Fu, Hao
Krishnamurthy, Prashanth
Garg, Siddharth
Author_xml – sequence: 1
  givenname: Hao
  orcidid: 0000-0002-8282-6580
  surname: Fu
  fullname: Fu, Hao
  email: hf881@nyu.edu
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 2
  givenname: Prashanth
  orcidid: 0000-0001-8264-7972
  surname: Krishnamurthy
  fullname: Krishnamurthy, Prashanth
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 3
  givenname: Siddharth
  orcidid: 0000-0002-6158-9512
  surname: Garg
  fullname: Garg, Siddharth
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
– sequence: 4
  givenname: Farshad
  orcidid: 0000-0002-8418-004X
  surname: Khorrami
  fullname: Khorrami, Farshad
  organization: Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, USA
CODEN IAECCG
ContentType Journal Article
DBID 97E
ESBDL
RIA
RIE
AAYXX
CITATION
DOA
DOI 10.1109/ACCESS.2025.3543333
DatabaseName IEEE Xplore (IEEE)
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
DOAJ Directory of Open Access Journals
DatabaseTitle CrossRef
DatabaseTitleList

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
– sequence: 2
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2169-3536
EndPage 36111
ExternalDocumentID oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b
10_1109_ACCESS_2025_3543333
10891759
Genre orig-research
GrantInformation_xml – fundername: Tamkeen under the NYUAD Research Institute Award
  grantid: CG010
– fundername: New York University Abu Dhabi (NYUAD) Center for Artificial Intelligence and Robotics (CAIR)
– fundername: Army Research Office
  grantid: W911NF-21-1-0155
  funderid: 10.13039/100000183
GroupedDBID 0R~
4.4
5VS
6IK
97E
AAJGR
ABAZT
ABVLG
ACGFS
ADBBV
AGSQL
ALMA_UNASSIGNED_HOLDINGS
BCNDV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
ESBDL
GROUPED_DOAJ
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
OK1
RIA
RIE
RNS
AAYXX
CITATION
RIG
IEDL.DBID DOA
ISSN 2169-3536
IngestDate Wed Aug 27 01:30:56 EDT 2025
Tue Jul 01 05:26:37 EDT 2025
Wed Aug 27 01:46:39 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
LinkModel DirectLink
ORCID 0000-0002-8282-6580
0000-0002-8418-004X
0000-0002-6158-9512
0000-0001-8264-7972
OpenAccessLink https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b
PageCount 13
ParticipantIDs doaj_primary_oai_doaj_org_article_dfdf5741cf124e58b065fe682479c26b
ieee_primary_10891759
crossref_primary_10_1109_ACCESS_2025_3543333
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 20250000
2025-00-00
2025-01-01
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – year: 2025
  text: 20250000
PublicationDecade 2020
PublicationTitle IEEE access
PublicationTitleAbbrev Access
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
References ref13
Lin (ref53)
ref57
ref12
ref56
LeCun (ref43) 2010
ref14
ref58
Qiao (ref25); 32
ref52
ref11
ref10
ref54
ref17
Zhang (ref31)
ref16
ref19
ref18
Nguyen (ref23)
ref51
ref46
ref48
ref47
ref42
ref41
ref44
Guo (ref15)
ref49
ref8
ref7
Zhao (ref2) 2024
ref9
ref6
ref5
ref40
Nguyen (ref20)
ref35
ref34
ref33
ref32
Patel (ref1)
ref39
ref38
Krizhevsky (ref45) 2009
Chen (ref50) 2017
Xie (ref30)
Kim (ref3) 2024
Cohen (ref37)
ref24
Hong (ref21)
ref26
ref22
Zhao (ref4) 2024
ref28
ref27
ref29
Zhang (ref36) 2022
Platt (ref55) 1999; 10
References_xml – ident: ref8
  doi: 10.1109/TDSC.2023.3263507
– volume: 10
  start-page: 61
  issue: 3
  year: 1999
  ident: ref55
  article-title: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods
  publication-title: Adv. Large Margin Classifiers
– ident: ref49
  doi: 10.23919/DATE48585.2020.9116489
– ident: ref44
  doi: 10.1016/j.neunet.2012.02.016
– ident: ref39
  doi: 10.1109/TBDATA.2019.2921572
– year: 2024
  ident: ref4
  article-title: PolyModel for Hedge Funds’ portfolio construction using machine learning
  publication-title: arXiv:2412.11019
– ident: ref57
  doi: 10.1109/SP54263.2024.00015
– ident: ref51
  doi: 10.1145/3474369.3486874
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref53
  article-title: Network in network
– start-page: 8068
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref21
  article-title: Handcrafted backdoors in deep neural networks
– ident: ref6
  doi: 10.1109/ACCESS.2023.3245570
– start-page: 1310
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref37
  article-title: Certified adversarial robustness via randomized smoothing
– ident: ref54
  doi: 10.1109/CVPR.2014.244
– start-page: 11372
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref30
  article-title: CRFL: Certifiably robust federated learning against backdoor attacks
– ident: ref16
  doi: 10.1109/ICCV48922.2021.01617
– year: 2024
  ident: ref3
  article-title: Face-GPS: A comprehensive technique for quantifying facial muscle dynamics in videos
  publication-title: arXiv:2401.05625
– ident: ref19
  doi: 10.1109/ICCV48922.2021.01615
– year: 2017
  ident: ref50
  article-title: Targeted backdoor attacks on deep learning systems using data poisoning
  publication-title: arXiv:1712.05526
– ident: ref35
  doi: 10.1109/ACCESS.2022.3207839
– ident: ref46
  doi: 10.1109/CVPR.2011.5995566
– ident: ref17
  doi: 10.1145/3359789.3359790
– ident: ref41
  doi: 10.1109/ICDM.2008.17
– year: 2009
  ident: ref45
  article-title: Learning multiple layers of features from tiny images
– ident: ref22
  doi: 10.1109/TDSC.2020.3028448
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref23
  article-title: WaNet–imperceptible warping-based backdoor attack
– ident: ref40
  doi: 10.1162/089976601750264965
– ident: ref48
  doi: 10.1109/ACCESS.2020.3032411
– ident: ref9
  doi: 10.1109/ACCESS.2019.2909068
– ident: ref27
  doi: 10.1109/ACCESS.2022.3141077
– ident: ref5
  doi: 10.1109/CCTA41146.2020.9206312
– ident: ref13
  doi: 10.1145/3319535.3363216
– start-page: 31474
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref31
  article-title: BagFlip: A certified defense against data poisoning
– year: 2024
  ident: ref2
  article-title: Hedge fund portfolio construction using PolyModel theory and iTransformer
  publication-title: arXiv:2408.03320
– ident: ref52
  doi: 10.1109/CVPR.2016.90
– ident: ref58
  doi: 10.1145/3658644.3670361
– ident: ref26
  doi: 10.1007/978-3-031-20065-6_11
– ident: ref29
  doi: 10.1609/aaai.v36i9.21191
– ident: ref42
  doi: 10.1145/342009.335388
– ident: ref34
  doi: 10.1109/ICCV.2017.74
– year: 2010
  ident: ref43
  publication-title: MNIST Handwritten Digit Database
– ident: ref33
  doi: 10.1109/SPW50608.2020.00025
– start-page: 3454
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref20
  article-title: Input-aware dynamic backdoor attack
– volume: 32
  start-page: 14004
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref25
  article-title: Defending neural backdoors via generative distribution modeling
– ident: ref32
  doi: 10.1145/3427228.3427264
– start-page: 1
  volume-title: Proc. Int. Conf. Learn. Represent.
  ident: ref15
  article-title: AEVA: Black-box backdoor detection using adversarial extreme value analysis
– ident: ref47
  doi: 10.1109/CVPR.2009.5206848
– ident: ref14
  doi: 10.1016/j.cose.2021.102277
– ident: ref56
  doi: 10.29172/7c2a6982-6d72-4cd8-bba6-2fccb06a7011
– ident: ref18
  doi: 10.1109/TIFS.2023.3297056
– ident: ref12
  doi: 10.1007/978-3-030-00470-5_13
– ident: ref24
  doi: 10.1109/SP.2019.00031
– ident: ref7
  doi: 10.21314/JCF.2023.003
– ident: ref10
  doi: 10.14722/ndss.2018.23291
– ident: ref11
  doi: 10.1109/ICCV48922.2021.01175
– ident: ref28
  doi: 10.1109/CVPR52688.2022.01465
– year: 2022
  ident: ref36
  article-title: Adversarial samples for deep monocular 6D object pose estimation
  publication-title: arXiv:2203.00302
– start-page: 1
  volume-title: Proc. NeurIPS Workshop Dataset Curation Secur.
  ident: ref1
  article-title: Bait and switch: Online training data poisoning of autonomous driving systems
– ident: ref38
  doi: 10.1145/377939.377946
SSID ssj0000816957
Score 2.3344336
Snippet The all-to-one (A2O) backdoor attack is one of the major adversarial threats against neural networks. Most existing A2O backdoor defenses operate in a...
SourceID doaj
crossref
ieee
SourceType Open Website
Index Database
Publisher
StartPage 36099
SubjectTerms Closed box
Detectors
Glass box
Neural network backdoors
Neural networks
Noise
novelty detection
output resiliency
Perturbation methods
Resilience
Robustness
Strips
Training
Title Detecting All-to-One Backdoor Attacks in Black-Box DNNs via Differential Robustness to Noise
URI https://ieeexplore.ieee.org/document/10891759
https://doaj.org/article/dfdf5741cf124e58b065fe682479c26b
Volume 13
hasFullText 1
inHoldings 1
isFullTextHit
isPrint