IBD: An Interpretable Backdoor-Detection Method via Multivariate Interactions

Bibliographic Details
Published in	Sensors (Basel, Switzerland) Vol. 22; no. 22; p. 8697
Main Authors	Xu, Yixiao; Liu, Xiaolei; Ding, Kangyi; Xin, Bangzhou
Format	Journal Article
Language	English
Published	Switzerland: MDPI AG, 10.11.2022
Subjects	Algorithms; backdoor detection; Classification; Datasets; deep neural network; Defense; Game theory; Humans; Inflammatory Bowel Diseases - diagnosis; Information Theory; interpretable deep learning; Methods; Multivariate analysis; Neural networks; Neural Networks, Computer; Sensors
Summary:	Recent work has shown that deep neural networks are vulnerable to backdoor attacks. Compared with the success of backdoor-attack methods, existing backdoor-defense methods lack theoretical foundations and interpretable solutions. Most defense methods rely on experience with the characteristics of previous attacks, but fail to defend against new attacks. In this paper, we propose IBD, an interpretable backdoor-detection method via multivariate interactions. Using information-theoretic techniques, IBD reveals how a backdoor works from the perspective of multivariate interactions among features. Based on the interpretable theorem, IBD enables defenders to detect backdoored models and poisoned examples without additional information about the specific attack method. Experiments on widely used datasets and models show that IBD achieves an average 78% increase in detection accuracy and an order-of-magnitude reduction in time cost compared with existing backdoor-detection methods.
ISSN:	1424-8220
DOI:	10.3390/s22228697