Paper title
NPC: Neuron Path Coverage via Characterizing Decision Logic of Deep Neural Networks
Paper authors
Abstract
Deep learning has recently been applied widely across many domains, e.g., image classification and audio recognition. However, the quality of Deep Neural Networks (DNNs) still raises concerns in practical operational environments, which calls for systematic testing, especially in safety-critical scenarios. Inspired by software testing, a number of structural coverage criteria have been designed and proposed to measure the test adequacy of DNNs. However, due to the black-box nature of DNNs, the existing structural coverage criteria are difficult to interpret, making it hard to understand the principles underlying them. The relationship between structural coverage and the decision logic of DNNs is unknown. Moreover, recent studies have revealed a lack of correlation between structural coverage and DNN defect detection, which raises further concerns about what a suitable DNN testing criterion should be. In this paper, we propose interpretable coverage criteria by constructing the decision structure of a DNN. Mirroring the control flow graph of a traditional program, we first extract a decision graph from a DNN based on its interpretation, where a path in the decision graph represents a decision logic of the DNN. Based on the control flow and data flow of the decision graph, we propose two variants of path coverage to measure the adequacy of test cases in exercising the decision logic. The higher the path coverage, the more diverse the decision logic of the DNN that is expected to be explored. Our large-scale evaluation results demonstrate that the paths in the decision graph are effective in characterizing the decisions of the DNN, and that the proposed coverage criteria are sensitive to errors, including natural errors and adversarial examples, and strongly correlated with output impartiality.
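The abstract does not define the decision graph or the two coverage variants, so the following is only a toy sketch of the general idea of path coverage, not the paper's method. It assumes that some interpretation technique has already produced a per-layer relevance score for each neuron, that a "path" is the set of top-k neurons in each layer, and that coverage is the fraction of reference paths exercised by a test suite. The names `extract_path` and `path_coverage`, the parameter `k`, and the score values are all hypothetical.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

# A path is one set of "critical" neuron indices per layer (an assumption
# made for this sketch; the abstract does not give the actual definition).
Path = Tuple[FrozenSet[int], ...]


def extract_path(layer_scores: Sequence[Sequence[float]], k: int) -> Path:
    """Pick the k highest-scoring neuron indices in every layer."""
    path = []
    for scores in layer_scores:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        path.append(frozenset(ranked[:k]))
    return tuple(path)


def path_coverage(test_paths: Iterable[Path], reference_paths: Iterable[Path]) -> float:
    """Fraction of reference paths that at least one test input exercises."""
    reference = set(reference_paths)
    if not reference:
        return 0.0
    return len(reference & set(test_paths)) / len(reference)


# Hypothetical relevance scores for two inputs on a network with two layers.
scores_a = [[0.1, 0.9, 0.3], [0.7, 0.2]]
scores_b = [[0.8, 0.1, 0.3], [0.2, 0.6]]

path_a = extract_path(scores_a, k=1)
path_b = extract_path(scores_b, k=1)

# A test suite containing only the first input covers one of two reference paths.
coverage = path_coverage([path_a], [path_a, path_b])
print(coverage)
```

In this simplified reading, adding a test input that follows a previously unseen path raises the coverage, which matches the abstract's intuition that higher path coverage means more diverse decision logic has been explored.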