Paper Title
Attention Distraction: Watermark Removal Through Continual Learning with Selective Forgetting
Paper Authors
Paper Abstract
Fine-tuning attacks are effective in removing embedded watermarks from deep learning models. However, when the source data is unavailable, it is challenging to erase the watermark without jeopardizing the model's performance. In this context, we introduce Attention Distraction (AD), a novel source-data-free watermark removal attack that makes the model selectively forget the embedded watermarks through customized continual learning. In particular, AD first anchors the model's attention on the main task using some unlabeled data. Then, through continual learning, a small number of "lures" (randomly selected natural images) assigned a new label distract the model's attention away from the watermarks. Experimental results on different datasets and networks corroborate that AD can thoroughly remove the watermark with a small resource budget without compromising the model's performance on the main task, outperforming state-of-the-art works.
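The entry contains no code, so the following is only a minimal PyTorch sketch of how the two-step procedure described in the abstract could look. Every name here (attention_distraction, anchor_loader, lure_loader, lure_label) is hypothetical; the self-pseudo-labeling anchor and the plain cross-entropy losses are assumptions, and the classifier head is assumed to already expose a spare class index for the lures. The actual AD method may use different losses or regularizers.

    import torch
    import torch.nn.functional as F
    from itertools import cycle

    def attention_distraction(model, anchor_loader, lure_loader, lure_label,
                              epochs=5, lr=1e-4, device="cuda"):
        """Hypothetical sketch: anchor the main task, then distract with lures."""
        model = model.to(device)

        # Step 1 (anchoring): with no source labels available, use the
        # watermarked model's own predictions on unlabeled data as
        # pseudo-labels that pin down main-task behavior (an assumption;
        # the paper only says the attention is "anchored" with unlabeled data).
        model.eval()
        anchors = []
        with torch.no_grad():
            for x in anchor_loader:
                x = x.to(device)
                anchors.append((x, model(x).argmax(dim=1)))

        # A handful of randomly selected natural images, all mapped to one
        # brand-new class index (the "new label" of the abstract).
        lures = [x.to(device) for x in lure_loader]

        # Step 2 (distraction): continual learning on both objectives, so the
        # capacity that encoded the watermark is reassigned to the lure class
        # while the pseudo-labels keep main-task accuracy intact.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        model.train()
        for _ in range(epochs):
            for (xa, ya), xl in zip(anchors, cycle(lures)):
                yl = torch.full((xl.size(0),), lure_label,
                                dtype=torch.long, device=device)
                loss = F.cross_entropy(model(xa), ya) \
                     + F.cross_entropy(model(xl), yl)
                opt.zero_grad()
                loss.backward()
                opt.step()
        return model

In this reading, the lure objective overwrites whatever spurious trigger-to-label mapping the watermark installed, while the frozen pseudo-labels act as the continual-learning constraint that prevents catastrophic forgetting of the main task.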