Paper Title
Hindering Adversarial Attacks with Implicit Neural Representations
Paper Authors
Paper Abstract
We introduce the Lossy Implicit Network Activation Coding (LINAC) defence, an input transformation which successfully hinders several common adversarial attacks on CIFAR-$10$ classifiers for perturbations up to $ε = 8/255$ in $L_\infty$ norm and $ε = 0.5$ in $L_2$ norm. Implicit neural representations are used to approximately encode pixel colour intensities in $2\text{D}$ images such that classifiers trained on transformed data appear to have robustness to small perturbations without adversarial training or large drops in performance. The seed of the random number generator used to initialise and train the implicit neural representation turns out to be necessary information for stronger generic attacks, suggesting its role as a private key. We devise a Parametric Bypass Approximation (PBA) attack strategy for key-based defences, which successfully invalidates an existing method in this category. Interestingly, our LINAC defence also hinders some transfer and adaptive attacks, including our novel PBA strategy. Our results emphasise the importance of a broad range of customised attacks despite apparent robustness according to standard evaluations. LINAC source code and parameters of the defended classifier evaluated throughout this submission are available at: https://github.com/deepmind/linac
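To make the core idea concrete, the following is a minimal toy sketch (not the authors' actual architecture; see the linked repository for that) of an implicit neural representation: a small MLP is fit to map $(x, y)$ coordinates to pixel intensities of a single image, with the RNG seed fixing the initialisation and thereby acting as the "private key". The hidden activations serve as the lossy encoding of the image. The function name, layer sizes, and training hyperparameters here are illustrative assumptions.

```python
import numpy as np

def fit_implicit_representation(image, seed, hidden=8, steps=50, lr=1e-2):
    """Toy implicit neural representation: fit f(x, y) -> intensity.

    Hypothetical stand-in for LINAC's encoder. The seed fully
    determines initialisation, playing the role of a private key;
    the returned hidden activations are the lossy encoding.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel() / w, ys.ravel() / h], axis=1)  # (N, 2)
    target = image.ravel()[:, None].astype(np.float64)           # (N, 1)

    rng = np.random.default_rng(seed)  # seed == private key
    W1 = rng.normal(0.0, 1.0, (2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 1))
    b2 = np.zeros(1)

    n = len(coords)
    for _ in range(steps):
        a1 = np.tanh(coords @ W1 + b1)   # hidden activations
        pred = a1 @ W2 + b2
        err = pred - target              # gradient of 0.5 * MSE
        # Backpropagate through the two layers.
        gW2 = a1.T @ err / n
        gb2 = err.mean(axis=0)
        da1 = (err @ W2.T) * (1.0 - a1 ** 2)
        gW1 = coords.T @ da1 / n
        gb1 = da1.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2

    a1 = np.tanh(coords @ W1 + b1)
    return a1.reshape(h, w, hidden)      # activations as lossy encoding
```

Because the encoding depends on the seed, an attacker without the key cannot reproduce the transformation exactly, which is what makes generic gradient-based attacks harder to mount against the downstream classifier.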