Paper Title

Architecture Disentanglement for Deep Neural Networks

Paper Authors

Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, ShengChuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao

Paper Abstract

Understanding the inner workings of deep neural networks (DNNs) is essential to provide trustworthy artificial intelligence techniques for practical applications. Existing studies typically involve linking semantic concepts to units or layers of DNNs, but fail to explain the inference process. In this paper, we introduce neural architecture disentanglement (NAD) to fill the gap. Specifically, NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes. We investigate whether, where, and how the disentanglement occurs through experiments conducted with handcrafted and automatically-searched network architectures, on both object-based and scene-based datasets. Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs. First, DNNs can be divided into sub-architectures for independent tasks. Second, deeper layers do not always correspond to higher semantics. Third, the connection type in a DNN affects how the information flows across layers, leading to different disentanglement behaviors. With NAD, we further explain why DNNs sometimes give wrong predictions. Experimental results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones. Code will be available at: https://github.com/hujiecpp/NAD.
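The abstract only summarizes NAD at a high level; the actual implementation is in the linked repository (https://github.com/hujiecpp/NAD). As a rough, hedged illustration of the general idea of carving a per-task sub-architecture out of a frozen, pre-trained network, the sketch below learns sigmoid channel gates with a sparsity penalty for a single class-versus-rest task. The toy backbone, the gating scheme, the loss, and all names here are assumptions made for illustration, not the paper's method.

```python
# Illustrative sketch only: per-task sub-architecture selection via learnable
# channel gates on a frozen backbone. This is NOT the paper's NAD code
# (see https://github.com/hujiecpp/NAD); backbone, gates, and loss are assumptions.
import torch
import torch.nn as nn


class TinyBackbone(nn.Module):
    """Toy stand-in for a pre-trained DNN."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x, gates=None):
        # gates: optional list of per-layer channel masks selecting a sub-architecture.
        h = torch.relu(self.conv1(x))
        if gates is not None:
            h = h * gates[0].view(1, -1, 1, 1)
        h = torch.relu(self.conv2(h))
        if gates is not None:
            h = h * gates[1].view(1, -1, 1, 1)
        return self.fc(self.pool(h).flatten(1))


class TaskGates(nn.Module):
    """Learnable per-task channel gates; the backbone itself stays frozen."""
    def __init__(self, channel_sizes=(16, 32)):
        super().__init__()
        self.logits = nn.ParameterList(
            [nn.Parameter(torch.zeros(c)) for c in channel_sizes]
        )

    def forward(self):
        # Soft gates in (0, 1); thresholding them yields a discrete sub-architecture.
        return [torch.sigmoid(p) for p in self.logits]


def disentangle_task(backbone, images, labels, task_class, steps=100, lam=1e-2):
    """Fit gates so only the channels needed for `task_class` stay switched on."""
    gates = TaskGates()
    opt = torch.optim.Adam(gates.parameters(), lr=0.1)
    for p in backbone.parameters():
        p.requires_grad_(False)
    targets = (labels == task_class).float()  # binary task: class vs. rest
    for _ in range(steps):
        g = gates()
        task_logit = backbone(images, gates=g)[:, task_class]
        # Keep the task prediction correct while sparsifying the gates.
        loss = nn.functional.binary_cross_entropy_with_logits(task_logit, targets)
        loss = loss + lam * sum(gi.sum() for gi in g)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Channels whose gate stays above 0.5 form this task's sub-architecture.
    return [(gi > 0.5) for gi in gates()]


if __name__ == "__main__":
    net = TinyBackbone()
    x = torch.randn(32, 3, 32, 32)
    y = torch.randint(0, 10, (32,))
    masks = disentangle_task(net, x, y, task_class=3)
    print([int(m.sum()) for m in masks])  # surviving channels per layer
```

Comparing the binary masks produced for different classes (e.g., by their overlap) is one way to probe the abstract's observation that misclassified images tend to be routed to sub-architectures similar to those of the correct class; the real analysis in the paper may differ.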
