Paper Title
Why Adversarial Training of ReLU Networks Is Difficult?
Paper Authors
Paper Abstract
This paper mathematically derives an analytic solution for the adversarial perturbation on a ReLU network, and theoretically explains the difficulty of adversarial training. Specifically, we formulate the dynamics of the adversarial perturbation generated by a multi-step attack, which shows that the perturbation tends to strengthen the eigenvectors corresponding to a few top-ranked eigenvalues of the Hessian matrix of the loss w.r.t. the input. We also prove that adversarial training tends to strengthen, in an exponential manner, the influence of unconfident input samples with large gradient norms. In addition, we find that adversarial training strengthens the influence of the Hessian matrix of the loss w.r.t. network parameters, which makes adversarial training more likely to oscillate along the directions of a few samples and increases the difficulty of adversarial training. Crucially, our proofs provide a unified explanation for previous findings on adversarial training.
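The following is a minimal sketch, under a local quadratic approximation of the loss, of why a multi-step attack concentrates the perturbation on a few top eigenvectors of the input Hessian. It is an illustration introduced here (the step size \alpha, gradient g, Hessian H, and perturbation \delta are notation assumed for this sketch), not the paper's exact ReLU-specific derivation. With g = \nabla_x L(x) and H = \nabla_x^2 L(x), one gradient-ascent attack step gives

\delta^{(t+1)} = \delta^{(t)} + \alpha \nabla_x L(x + \delta^{(t)}) \approx (I + \alpha H)\,\delta^{(t)} + \alpha g .

Unrolling from \delta^{(0)} = 0 and writing g = \sum_i c_i v_i in the eigenbasis H v_i = \lambda_i v_i,

\delta^{(m)} \approx \alpha \sum_{t=0}^{m-1} (I + \alpha H)^t g = \sum_i \frac{(1 + \alpha \lambda_i)^m - 1}{\lambda_i}\, c_i v_i ,

so the components along eigenvectors with the largest positive eigenvalues grow roughly geometrically in the number of attack steps m, consistent with the claim that the perturbation strengthens a few top-ranked eigenvectors of the Hessian of the loss w.r.t. the input.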