Deflecting Adversarial Attacks

Paper title

Deflecting Adversarial Attacks

Authors

Yao Qin, Nicholas Frosst, Colin Raffel, Garrison Cottrell, Geoffrey Hinton

Abstract

There has been an ongoing cycle where stronger defenses against adversarial attacks are subsequently broken by a more advanced defense-aware attack. We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that semantically resembles the attack's target class. To this end, we first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance on both standard and defense-aware attacks. We then show that undetected attacks against our defense often perceptually resemble the adversarial target class by performing a human study where participants are asked to label images produced by the attack. These attack images can no longer be called "adversarial" because our network classifies them the same way as humans do.
