Paper Title
Semantic Preserving Adversarial Attack Generation with Autoencoder and Genetic Algorithm
Paper Authors
Paper Abstract
Widely used deep learning models are found to have poor robustness: even slight noise can fool state-of-the-art models into making incorrect predictions. While there are many high-performance attack-generation methods, most of them add perturbations directly to the original data and measure them with L_p norms; this can break the major structure of the data, producing invalid attacks. In this paper, we propose a black-box attack that, instead of modifying the original data, modifies the latent features of the data extracted by an autoencoder; we then measure the noise in semantic space to preserve the semantics of the data. We trained autoencoders on the MNIST and CIFAR-10 datasets and found optimal adversarial perturbations using a genetic algorithm. Our approach achieved a 100% attack success rate on the first 100 samples of MNIST and CIFAR-10 with less perturbation than FGSM.
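The core idea — evolving a small perturbation in an autoencoder's latent space until the decoded sample is misclassified — can be sketched as follows. This is a minimal illustration, not the authors' implementation: `decode` stands in for the autoencoder's decoder, `predict` for the black-box classifier, and all names and hyperparameters are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's components:
# decode() plays the autoencoder's decoder, predict() the black-box classifier.
W = rng.normal(size=(8, 2))          # toy "decoder" weights: 2-D latent -> 8-D data

def decode(z):
    return np.tanh(W @ z)

def predict(x):
    return int(x.sum() > 0)          # toy black-box binary classifier

def latent_ga_attack(z0, pop_size=30, generations=200, sigma=0.5):
    """Evolve a latent perturbation dz so that decode(z0 + dz) is
    misclassified, while keeping ||dz|| (distance in the latent /
    semantic space) small."""
    orig_label = predict(decode(z0))

    def fitness(dz):
        flipped = predict(decode(z0 + dz)) != orig_label
        # A successful flip dominates; among flips, prefer smaller perturbations.
        return (1.0 if flipped else 0.0) - 0.1 * np.linalg.norm(dz)

    population = rng.normal(scale=sigma, size=(pop_size, z0.size))
    for _ in range(generations):
        scores = np.array([fitness(dz) for dz in population])
        elite = population[np.argsort(scores)[-pop_size // 2:]]            # selection
        children = elite + rng.normal(scale=sigma / 2, size=elite.shape)   # mutation
        population = np.vstack([elite, children])                          # elitism

    scores = np.array([fitness(dz) for dz in population])
    return population[np.argmax(scores)]

z0 = rng.normal(size=2)
dz = latent_ga_attack(z0)
print(predict(decode(z0)), predict(decode(z0 + dz)), float(np.linalg.norm(dz)))
```

Because the fitness term rewards small `||dz||` only among successful flips, the search measures the perturbation in the latent (semantic) space rather than in pixel space, which is the property the paper argues keeps the attack valid.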