Paper Title


Approximate Manifold Defense Against Multiple Adversarial Perturbations

Paper Authors

Jay Nandy, Wynne Hsu, Mong Li Lee

Paper Abstract


Existing defenses against adversarial attacks are typically tailored to a specific perturbation type. Using adversarial training to defend against multiple types of perturbation requires expensive adversarial examples from different perturbation types at each training step. In contrast, manifold-based defense incorporates a generative network to project an input sample onto the clean data manifold. This approach eliminates the need to generate expensive adversarial examples while achieving robustness against multiple perturbation types. However, the success of this approach relies on whether the generative network can capture the complete clean data manifold, which remains an open problem for complex input domains. In this work, we devise an approximate manifold defense mechanism, called RBF-CNN, for image classification. Instead of capturing the complete data manifold, we use an RBF layer to learn the density of small image patches. RBF-CNN also utilizes a reconstruction layer that mitigates any minor adversarial perturbations. Further, incorporating our proposed reconstruction process for training improves the adversarial robustness of our RBF-CNN models. Experimental results on the MNIST and CIFAR-10 datasets indicate that RBF-CNN offers robustness to multiple perturbation types without the need for expensive adversarial training.
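The abstract describes an RBF layer that models the density of small image patches and a reconstruction layer that smooths away minor perturbations before classification. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation: patches are scored against learned prototype patches with an RBF kernel and each patch is rebuilt as a kernel-weighted mixture of prototypes. The prototype count, patch size, fixed bandwidth gamma, and softmax weighting are illustrative assumptions.

```python
# Hypothetical sketch of an RBF patch-density + reconstruction layer
# (illustrative only; hyperparameters and weighting scheme are assumptions).
import torch
import torch.nn.functional as F

class RBFReconstruction(torch.nn.Module):
    def __init__(self, num_prototypes=64, patch_size=3, channels=1, gamma=10.0):
        super().__init__()
        self.patch_size = patch_size
        self.gamma = gamma  # RBF bandwidth (fixed here; could be learnable)
        # Learnable prototype patches, stored as flattened vectors.
        self.prototypes = torch.nn.Parameter(
            torch.randn(num_prototypes, channels * patch_size * patch_size) * 0.1
        )

    def forward(self, x):
        b, c, h, w = x.shape
        p = self.patch_size
        # Extract overlapping patches: (B, C*p*p, L) -> (B, L, C*p*p).
        patches = F.unfold(x, kernel_size=p, padding=p // 2).transpose(1, 2)
        # RBF responses: squared distance of each patch to each prototype.
        d2 = torch.cdist(patches, self.prototypes.unsqueeze(0).expand(b, -1, -1)) ** 2
        weights = torch.softmax(-self.gamma * d2, dim=-1)  # (B, L, K)
        # Rebuild each patch as a kernel-weighted combination of prototypes.
        recon_patches = weights @ self.prototypes  # (B, L, C*p*p)
        # Fold overlapping patch reconstructions back into an image, averaging overlaps.
        recon = F.fold(recon_patches.transpose(1, 2), output_size=(h, w),
                       kernel_size=p, padding=p // 2)
        ones = F.fold(torch.ones_like(recon_patches).transpose(1, 2),
                      output_size=(h, w), kernel_size=p, padding=p // 2)
        return recon / ones  # smoothed image, to be fed to a standard CNN classifier

# Usage: smooth an MNIST-sized batch before a downstream classifier.
layer = RBFReconstruction(num_prototypes=64, patch_size=3, channels=1)
x = torch.rand(8, 1, 28, 28)
print(layer(x).shape)  # torch.Size([8, 1, 28, 28])
```

In this reading, the reconstruction acts as a learned denoiser over the patch manifold: small adversarial perturbations move a patch only slightly in prototype space, so the weighted reconstruction pulls it back toward high-density clean patches.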
