Paper Title

How many perturbations break this model? Evaluating robustness beyond adversarial accuracy

Paper Authors

Raphael Olivier, Bhiksha Raj

Paper Abstract

Robustness to adversarial attacks is typically evaluated with adversarial accuracy. While essential, this metric does not capture all aspects of robustness; in particular, it leaves out the question of how many perturbations can be found for each point. In this work, we introduce an alternative approach, adversarial sparsity, which quantifies how difficult it is to find a successful perturbation given both an input point and a constraint on the direction of the perturbation. We show that sparsity provides valuable insight into neural networks in multiple ways: for instance, it reveals important differences between current state-of-the-art robust models that accuracy analysis alone does not, and it suggests approaches for improving their robustness. When applied to broken defenses that are effective against weak attacks but not strong ones, sparsity can discriminate between totally ineffective and partially effective defenses. Finally, with sparsity we can measure increases in robustness that do not affect accuracy: we show, for example, that data augmentation alone can increase adversarial robustness, without adversarial training.
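To convey the intuition behind the sparsity measure, here is a minimal sketch of a simplified proxy: sample random perturbation directions around an input, scale each to the surface of an L-infinity ball, and measure how often the model's prediction flips. This is not the authors' exact procedure (the paper measures sparsity with attacks constrained to directional regions, not single random steps); the function name `estimate_sparsity_proxy`, the parameters `eps` and `n_directions`, and the toy model are all illustrative assumptions.

```python
# Hypothetical sketch of a sparsity-like estimate: the fraction of random
# L-inf perturbation directions that flip the model's prediction on x.
# A lower success fraction corresponds to higher "sparsity" (perturbations
# are harder to find for that point).
import torch
import torch.nn as nn

def estimate_sparsity_proxy(model, x, y, eps=0.03, n_directions=100):
    """Fraction of sampled directions whose eps-scaled perturbation
    changes the prediction on x (true label y)."""
    model.eval()
    successes = 0
    with torch.no_grad():
        for _ in range(n_directions):
            d = torch.randn_like(x)        # random direction
            d = eps * d.sign()             # project onto the L-inf ball surface
            pred = model(x + d).argmax(dim=1)
            successes += int(pred.item() != y)
    return successes / n_directions

# Toy usage with a random linear classifier and a random input; the model's
# own clean prediction stands in for the label.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(1, 3, 32, 32)
y = model(x).argmax(dim=1).item()
print(estimate_sparsity_proxy(model, x, y))
```

In the paper itself, each direction constraint restricts a full (stronger) attack rather than a single random step, so the proxy above only illustrates the kind of quantity being measured.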
