Title
Fast Adversarial Training with Adaptive Step Size
Authors
Abstract
While adversarial training and its variants have been shown to be the most effective algorithms for defending against adversarial attacks, their extremely slow training process makes it hard to scale to large datasets like ImageNet. The key idea of recent works on accelerating adversarial training is to substitute multi-step attacks (e.g., PGD) with single-step attacks (e.g., FGSM). However, these single-step methods suffer from catastrophic overfitting, where the accuracy against PGD attacks suddenly drops to nearly 0% during training, destroying the robustness of the networks. In this work, we study the phenomenon from the perspective of training instances. We show that catastrophic overfitting is instance-dependent, and that fitting instances with larger gradient norms is more likely to cause catastrophic overfitting. Based on our findings, we propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS). ATAS learns an instance-wise adaptive step size that is inversely proportional to each instance's gradient norm. Theoretical analysis shows that ATAS converges faster than the commonly adopted non-adaptive counterparts. Empirically, ATAS consistently mitigates catastrophic overfitting and achieves higher robust accuracy on CIFAR10, CIFAR100, and ImageNet when evaluated under various adversarial budgets.
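The core idea described above can be illustrated with a minimal sketch: each instance gets a step size inversely proportional to its input-gradient norm, which is then used in a single-step (FGSM-style) update clipped to the adversarial budget. The function names, the `base_lr` and `eps` hyperparameters, and the toy gradients below are illustrative assumptions, not the paper's exact algorithm or values.

```python
import numpy as np

def adaptive_step_sizes(grads, base_lr=1.0, eps=1e-8):
    """Per-instance step sizes inversely proportional to each instance's
    gradient norm (the idea behind ATAS). `base_lr` and `eps` are
    illustrative hyperparameters, not values from the paper."""
    # grads: (batch, dim) array of input gradients, one row per instance
    norms = np.linalg.norm(grads, axis=1)
    return base_lr / (norms + eps)

def fgsm_step(x, grads, step_sizes, budget):
    """Single-step FGSM-style perturbation with instance-wise step
    sizes, clipped to the L-infinity adversarial budget."""
    delta = step_sizes[:, None] * np.sign(grads)
    return x + np.clip(delta, -budget, budget)

# Toy batch: two instances, one with a much larger gradient norm.
x = np.zeros((2, 4))
grads = np.array([[10.0, 10.0, 10.0, 10.0],   # large-norm instance
                  [0.1,  0.1,  0.1,  0.1]])   # small-norm instance
steps = adaptive_step_sizes(grads)
x_adv = fgsm_step(x, grads, steps, budget=8 / 255)
# The large-norm instance receives the smaller step size, which is
# what mitigates catastrophic overfitting in the paper's analysis.
```

Note that the perturbation stays inside the `8/255` budget regardless of the step size, so the adaptivity only changes how aggressively each instance moves within that budget.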