Paper Title

Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention

Paper Authors

Yu Yang, Seungbae Kim, Jungseock Joo

Paper Abstract

Interpretability is an important property for visual models as it helps researchers and users understand the internal mechanism of a complex model. However, generating semantic explanations about the learned representation is challenging without direct supervision to produce such explanations. We propose a general framework, Latent Visual Semantic Explainer (LaViSE), to teach any existing convolutional neural network to generate text descriptions about its own latent representations at the filter level. Our method constructs a mapping between the visual and semantic spaces using generic image datasets with images and category names, and then transfers the mapping to a target domain that does not have semantic labels. The proposed framework employs a modular structure and makes it possible to analyze any trained network whether or not its original training data is available. We show that our method can generate novel descriptions for learned filters beyond the set of categories defined in the training dataset, and we perform an extensive evaluation on multiple datasets. We also demonstrate a novel application of our method to unsupervised dataset bias analysis, which allows us to automatically discover hidden biases in datasets or compare different subsets without using additional labels. The dataset and code are made public to facilitate further research.
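To make the idea of filter-level semantic explanation concrete, below is a minimal illustrative sketch, not the authors' released implementation. It assumes an off-the-shelf pretrained ResNet-50, a placeholder concept vocabulary with random word embeddings, and an untrained linear projection standing in for the learned visual-to-semantic mapping; in LaViSE that mapping is trained on a generic image dataset and then transferred, whereas here everything semantic is a hypothetical stand-in used only to show how a filter's activation map could be pooled and matched against concept names.

```python
# Illustrative sketch only: pool a CNN filter's activation and retrieve the
# nearest concept names in a semantic embedding space.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Any existing CNN can be analyzed; ResNet-50 is used here as an example.
cnn = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

# Hypothetical semantic space: one (random, placeholder) embedding per concept.
concept_names = ["dog", "grass", "wheel", "face", "water"]
concept_embeddings = F.normalize(torch.randn(len(concept_names), 300), dim=1)

# Hypothetical mapping from the 2048-d visual feature space of ResNet-50's last
# conv block to the 300-d semantic space; random here, learned in the paper.
visual_to_semantic = torch.nn.Linear(2048, 300)

def explain_filter(image: torch.Tensor, filter_idx: int, top_k: int = 3):
    """Return the top-k concept names most aligned with one conv filter."""
    feats = {}
    hook = cnn.layer4.register_forward_hook(
        lambda module, inputs, output: feats.update(act=output)  # (1, 2048, H, W)
    )
    with torch.no_grad():
        cnn(image.unsqueeze(0))
    hook.remove()

    act = feats["act"][0]                                 # (2048, H, W)
    # Use the chosen filter's activation map as spatial attention over all channels.
    attn = act[filter_idx]
    attn = attn / (attn.sum() + 1e-6)
    pooled = (act * attn.unsqueeze(0)).sum(dim=(1, 2))    # attention-pooled 2048-d vector

    with torch.no_grad():
        sem = F.normalize(visual_to_semantic(pooled), dim=0)
        scores = concept_embeddings @ sem                 # cosine similarity per concept
    top = scores.topk(top_k).indices.tolist()
    return [concept_names[i] for i in top]

# Usage (random image tensor for illustration):
# print(explain_filter(torch.randn(3, 224, 224), filter_idx=100))
```

Because the explainer only reads activations through a forward hook and maps them into an external semantic space, it can be attached to any trained network without retraining it, which is the modularity property the abstract refers to.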
