Paper Title

Robustness Out of the Box: Compositional Representations Naturally Defend Against Black-Box Patch Attacks

Paper Authors

Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille

Paper Abstract

Patch-based adversarial attacks introduce a perceptible but localized change to the input that induces misclassification. While progress has been made in defending against imperceptible attacks, it remains unclear how patch-based attacks can be resisted. In this work, we study two different approaches for defending against black-box patch attacks. First, we show that adversarial training, which is successful against imperceptible attacks, has limited effectiveness against state-of-the-art location-optimized patch attacks. Second, we find that compositional deep networks, which have part-based representations that lead to innate robustness to natural occlusion, are robust to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition Benchmark, without adversarial training. Moreover, the robustness of compositional models outperforms that of adversarially trained standard models by a large margin. However, on GTSRB, we observe that they have problems discriminating between similar traffic signs with fine-grained differences. We overcome this limitation by introducing part-based finetuning, which improves fine-grained recognition. By leveraging compositional representations, this is the first work that defends against black-box patch attacks without expensive adversarial training. This defense is more robust than adversarial training and more interpretable because it can locate and ignore adversarial patches.
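To make the threat model concrete, below is a minimal illustrative sketch of a black-box patch attack driven by random search: the attacker only queries the model's output probabilities and tries random patch contents and locations, keeping whichever candidate most lowers the true-class score. This is an assumption-laden toy example (the `predict` interface, patch size, and query budget are made up here), not the location-optimized attack evaluated in the paper.

```python
import numpy as np

def apply_patch(image, patch, top, left):
    """Paste a square patch onto a copy of the image (H, W, C array in [0, 1])."""
    out = image.copy()
    h, w = patch.shape[:2]
    out[top:top + h, left:left + w] = patch
    return out

def random_search_patch_attack(predict, image, true_label,
                               patch_size=16, queries=500, rng=None):
    """Query-only (black-box) patch attack sketch.

    `predict` is assumed to map an image to a vector of class probabilities;
    we sample random patch contents and locations and keep the candidate
    that most reduces the confidence in the true class.
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W, C = image.shape
    best_img = image
    best_score = predict(image)[true_label]
    for _ in range(queries):
        patch = rng.random((patch_size, patch_size, C))          # random patch texture
        top = int(rng.integers(0, H - patch_size + 1))           # random location
        left = int(rng.integers(0, W - patch_size + 1))
        candidate = apply_patch(image, patch, top, left)
        probs = predict(candidate)
        if probs[true_label] < best_score:                        # keep the best candidate so far
            best_img, best_score = candidate, probs[true_label]
            if np.argmax(probs) != true_label:                    # stop once misclassified
                break
    return best_img
```

A defense with part-based, compositional representations aims to localize such a patch as an occluder and ignore it, rather than relying on adversarial training to harden the whole network.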
