Paper Title
Exploiting the Sensitivity of $L_2$ Adversarial Examples to Erase-and-Restore
Paper Authors
Paper Abstract
By adding carefully crafted perturbations to input images, adversarial examples (AEs) can be generated to mislead neural-network-based image classifiers. $L_2$ adversarial perturbations by Carlini and Wagner (CW) are among the most effective but difficult-to-detect attacks. While many countermeasures against AEs have been proposed, detection of adaptive CW-$L_2$ AEs remains an open question. We find that, by randomly erasing some pixels in an $L_2$ AE and then restoring it with an inpainting technique, the AE tends to be classified differently before and after these steps, whereas a benign sample does not show this symptom. We thus propose a novel AE detection technique, Erase-and-Restore (E&R), that exploits this intriguing sensitivity of $L_2$ attacks. Experiments conducted on two popular image datasets, CIFAR-10 and ImageNet, show that the proposed technique detects over 98% of $L_2$ AEs while maintaining a very low false positive rate on benign images. The detection technique also exhibits high transferability: a detection system trained on CW-$L_2$ AEs can accurately detect AEs generated with another $L_2$ attack method. More importantly, our approach demonstrates strong resilience to adaptive $L_2$ attacks, filling a critical gap in AE detection. Finally, we interpret the detection technique through both visualization and quantification.
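
To make the erase-and-restore idea concrete, below is a minimal sketch of one E&R trial in Python. It is an illustration only: the erase fraction, number of trials, label-flip vote, and the OpenCV inpainting backend (`cv2.inpaint`) are assumptions, not the paper's exact configuration, and the `classify` callable is a hypothetical stand-in for the target model. The paper itself trains a detection system rather than thresholding label flips.

```python
# Sketch of one Erase-and-Restore (E&R) trial, under the assumptions
# stated above; not the authors' exact method.
import numpy as np
import cv2


def erase_and_restore(image, erase_frac=0.1, rng=None):
    """Randomly erase a fraction of pixels, then restore via inpainting.

    image: HxWx3 uint8 array. erase_frac is an illustrative choice.
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    # 8-bit single-channel mask: 255 marks pixels to erase and inpaint.
    mask = (rng.random((h, w)) < erase_frac).astype(np.uint8) * 255
    erased = image.copy()
    erased[mask > 0] = 0
    # cv2.inpaint fills the masked pixels from the surrounding context.
    return cv2.inpaint(erased, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)


def looks_adversarial(image, classify, n_trials=8, flip_threshold=0.5):
    """Flag the input if E&R flips its predicted label often.

    classify: hypothetical callable mapping an HxWx3 uint8 image to a
    class id. A simple flip-rate vote stands in for the trained detector
    described in the abstract.
    """
    original_label = classify(image)
    flips = sum(
        classify(erase_and_restore(image)) != original_label
        for _ in range(n_trials)
    )
    return flips / n_trials >= flip_threshold
```

The intuition matching the abstract: benign images are largely invariant to this perturb-and-repair cycle, while the finely tuned $L_2$ perturbation is destroyed by erasure and not reproduced by inpainting, so the AE's classification tends to change.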