DP²: Dataset Protection by Data Poisoning

Bibliographic Details
Published in	IEEE Transactions on Dependable and Secure Computing, Vol. 21, no. 2, pp. 636-649
Main Authors	Fang, Han; Qiu, Yupeng; Qin, Guorui; Zhang, Jiyi; Chen, Kejiang; Zhang, Weiming; Chang, Ee-Chien
Format	Journal Article
Language	English
Published	Washington: IEEE; IEEE Computer Society, 01.03.2024
Subjects	Biological system modeling; Black boxes; Business models; Data models; Data poisoning; dataset protection; Datasets; Distortion; imperceptibility; Perturbation methods; Recoverability; Robustness; stealthiness; surrogate model; Training; Visualization

Summary:	Data poisoning can serve as an effective way to protect a dataset from surrogate training: the performance of a surrogate model can be greatly degraded if it is trained on the poisoned dataset. This paper focuses on an advanced scenario in which the attacker may be an experienced malicious employee who has white-box access to the dataset and black-box (query-only) access to the original business model (e.g., an MLaaS model). Under this condition, three main requirements must be satisfied: imperceptibility, robustness, and stealthiness. In this paper, we propose a novel data-poisoning-based dataset protection method, dubbed DP², to meet these requirements. To achieve imperceptibility and robustness, we propose a poisoning mechanism carried out by a purpose-designed dual-U-Net-based poisoning network; training it with the reference mapping strategy and the corresponding noise layer achieves both properties. For stealthiness, we propose a recover-net that eliminates the perturbation, so that the business model exposed through black-box access can be an enclosed combination of the recover-net and the original business model. In addition, the recover-net allows the poisoned dataset to be reused for normal purposes. Various experiments indicate that the proposed scheme outperforms other schemes in imperceptibility and robustness. Making the poisoned data recoverable greatly helps ensure stealthiness, and the resulting recoverability of poisoned data could be utilized in other scenarios.
ISSN:	1545-5971, 1941-0018
DOI:	10.1109/TDSC.2022.3227945