Paper Title
Analyzing the Noise Robustness of Deep Neural Networks
Paper Authors
Paper Abstract
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization, consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.
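
The abstract formulates datapath extraction as a subset selection problem solved by constructing and training a neural network. As a rough, hypothetical illustration of that general idea only (not the paper's actual formulation), the PyTorch sketch below relaxes subset selection over the channels of a single convolutional layer into learnable sigmoid gates, trained so that the gated model preserves its original prediction while an L1 penalty pushes most gates toward zero. The ChannelGate module, the choice of layer (VGG16 features[28]), the loss weight lam, and the reliance on a recent torchvision release are all assumptions made for illustration.

# Hypothetical illustration only: approximate "critical channel" selection for one
# conv layer by training per-channel gates, loosely mirroring the idea of solving
# subset selection with a trained network (not the paper's actual method).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class ChannelGate(nn.Module):
    """Scales a layer's output feature maps by learnable per-channel gates in [0, 1]."""
    def __init__(self, num_channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, feature_maps):
        gates = torch.sigmoid(self.logits).view(1, -1, 1, 1)
        return feature_maps * gates

def extract_critical_channels(model, layer, image, num_steps=200, lam=0.01):
    """Learn gates on `layer` so the gated model keeps its original prediction for
    `image`, while an L1 penalty on the gates favors as few active channels as possible."""
    model.eval()
    for p in model.parameters():
        p.requires_grad_(False)                        # only the gates are trained
    with torch.no_grad():
        target = model(image).argmax(dim=1)            # original predicted class

    gate = ChannelGate(layer.out_channels)
    hook = layer.register_forward_hook(lambda mod, inp, out: gate(out))  # gate the layer output
    optimizer = torch.optim.Adam(gate.parameters(), lr=0.05)

    for _ in range(num_steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(image), target)               # preserve the prediction
        loss = loss + lam * torch.sigmoid(gate.logits).sum()       # sparsity: few open gates
        loss.backward()
        optimizer.step()

    hook.remove()
    return (torch.sigmoid(gate.logits) > 0.5).nonzero().flatten()  # indices of "critical" channels

if __name__ == "__main__":
    model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)     # any CNN classifier would do
    image = torch.randn(1, 3, 224, 224)                            # stand-in for a real image
    print(extract_critical_channels(model, model.features[28], image).tolist())

Comparing the channel sets returned for a normal example and its adversarial counterpart gives a crude sense of where their datapaths diverge, which is the kind of comparison the paper's multi-level visualization is designed to support.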