Paper Title

Self-Gradient Networks

Authors

Hossein Aboutalebi, Mohammad Javad Shafiee, Alexander Wong

Abstract


The incredible effectiveness of adversarial attacks at fooling deep neural networks poses a tremendous hurdle to the widespread adoption of deep learning in safety- and security-critical domains. While adversarial defense mechanisms have been proposed since the discovery of the adversarial vulnerability of deep neural networks, there is a long path to fully understanding and addressing this issue. In this study, we hypothesize that part of the reason for the incredible effectiveness of adversarial attacks is their ability to implicitly tap into and exploit the gradient flow of a deep neural network. This innate ability to exploit gradient flow makes defending against such attacks quite challenging. Motivated by this hypothesis, we argue that if a deep neural network architecture can explicitly tap into its own gradient flow during training, it can boost its defense capability significantly. Inspired by this, we introduce the concept of self-gradient networks, a novel deep neural network architecture designed to be more robust against adversarial perturbations. Gradient flow information is leveraged within self-gradient networks to achieve greater perturbation stability beyond what can be achieved in the standard training process. We conduct a theoretical analysis to gain better insights into the behaviour of the proposed self-gradient networks and to illustrate the efficacy of leveraging this additional gradient flow information. The proposed self-gradient network architecture enables much more efficient and effective adversarial training, leading to at least 10X faster convergence towards an adversarially robust solution. Experimental results demonstrate the effectiveness of self-gradient networks when compared with state-of-the-art adversarial learning strategies, with a 10% improvement on the CIFAR10 dataset under PGD and CW adversarial perturbations.
