Removing Watermarks For Image Processing Networks Via Referenced Subspace Attention

Bibliographic Details
Published in Computer journal Vol. 67; no. 2; pp. 498-507
Main Authors Xue, Yuliang; Zhu, Yuhao; Zhu, Zhiying; Li, Sheng; Qian, Zhenxing; Zhang, Xinpeng
Format Journal Article
Language English
Published Oxford University Press 17.02.2024
Summary: A deep neural network model extraction attack retrains a surrogate model on the outputs that a target model produces for a given set of inputs. Such attacks are hard to defend against, which harms model owners' interests. Recent work proposes a model watermarking scheme for image processing networks that can prove intellectual property ownership of deep models even after a model extraction attack. This scheme ensures that, once the target model (an image processing network) is watermarked, the watermark can be extracted from the output of the surrogate model. In this paper, we propose a new model extraction attack scheme to defeat this latest method. Instead of directly using the output images of a target model, we use their reconstructed versions for model retraining, where an asymmetrical UNet is proposed for image reconstruction. To thoroughly remove the watermarking traces, we incorporate a referenced subspace attention module into the asymmetrical UNet, which removes the watermark by projecting the outputs of the target model into the subspaces of the reference image. Various experiments demonstrate the effectiveness of our attack.
ISSN: 0010-4620, 1460-2067
DOI: 10.1093/comjnl/bxac190
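
Note on the summary: the phrase "projecting the outputs of the target model into the subspaces of the reference image" can be illustrated with plain linear algebra. The sketch below is only a generic orthogonal projection in pure Python, not the paper's referenced subspace attention module, which sits inside a UNet and whose details are not given in this record. The helper names (`orthonormal_basis`, `project`) and the toy vectors are hypothetical and chosen purely for illustration.

```python
# Illustrative only: generic orthogonal projection onto a subspace.
# All names and values here are hypothetical, not taken from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: turn spanning vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip vectors that are already in the span
            basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Orthogonal projection of x onto span(basis)."""
    out = [0.0] * len(x)
    for b in basis:
        c = dot(x, b)
        out = [oi + c * bi for oi, bi in zip(out, b)]
    return out

# Toy "reference" subspace: the xy-plane in R^3.
reference_vectors = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormal_basis(reference_vectors)

# Toy "target output" with a component (z) outside that subspace.
target_output = [3.0, 4.0, 5.0]
projected = project(target_output, basis)
residual = [t - p for t, p in zip(target_output, projected)]
```

In this toy setting, whatever part of `target_output` lies outside the reference subspace ends up in `residual` and is discarded by the projection. That is the intuition behind removing a component that the reference does not share; how the paper constructs its subspaces and applies attention is not described here.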