Paper Title

From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation

Paper Authors

Reduan Achtibat, Maximilian Dreyer, Ilona Eisenbraun, Sebastian Bosse, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

Paper Abstract

The field of eXplainable Artificial Intelligence (XAI) aims to bring transparency to today's powerful but opaque deep learning models. While local XAI methods explain individual predictions in the form of attribution maps, thereby identifying where important features occur (but not providing information about what they represent), global explanation techniques visualize what concepts a model has generally learned to encode. Both types of methods thus only provide partial insights and leave the burden of interpreting the model's reasoning to the user. In this work we introduce the Concept Relevance Propagation (CRP) approach, which combines the local and global perspectives and thus allows answering both the "where" and "what" questions for individual predictions. We demonstrate the capability of our method in various settings, showcasing that CRP leads to more human-interpretable explanations and provides deep insights into the model's representation and reasoning through concept atlases, concept composition analyses, and quantitative investigations of concept subspaces and their role in fine-grained decision making.
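
As a rough intuition for how such concept-conditional explanations can be computed, the sketch below restricts an LRP-style backward pass to a single channel of a chosen layer, so the resulting input heatmap shows *where* that one concept (*what*) mattered for the prediction. This is a minimal illustrative sketch, not the authors' implementation: the toy two-layer network, the plain LRP-ε rule, and names such as `concept_conditional_heatmap` are assumptions for demonstration only.

```python
import torch
import torch.nn as nn

def lrp_epsilon(layer, activation, relevance, eps=1e-6):
    """Propagate relevance through one layer with the LRP-epsilon rule."""
    activation = activation.clone().detach().requires_grad_(True)
    z = layer(activation)                              # pre-activations
    stab = z + eps * ((z >= 0).to(z.dtype) * 2 - 1)    # stabilized denominator
    s = (relevance / stab).detach()
    (z * s).sum().backward()                           # gradient trick: grad = W^T s
    return activation * activation.grad                # R_i = a_i * c_i

def concept_conditional_heatmap(conv, head, x, class_idx, channel_idx):
    """Input heatmap for one class, conditioned on one concept (channel)."""
    a1 = torch.relu(conv(x))                           # concept-layer activations
    logits = head(a1.flatten(1))
    r_out = torch.zeros_like(logits)
    r_out[:, class_idx] = logits[:, class_idx]         # start from the class logit
    r1 = lrp_epsilon(nn.Sequential(nn.Flatten(), head), a1, r_out)
    mask = torch.zeros_like(r1)
    mask[:, channel_idx] = 1.0                         # keep one concept channel only
    return lrp_epsilon(conv, x, r1 * mask)             # "where" map for that "what"

# Hypothetical usage on a toy network:
conv = nn.Conv2d(3, 8, 3, padding=1)
head = nn.Linear(8 * 32 * 32, 10)
x = torch.randn(1, 3, 32, 32)
heatmap = concept_conditional_heatmap(conv, head, x, class_idx=0, channel_idx=3)
print(heatmap.shape)  # torch.Size([1, 3, 32, 32])
```

The key design choice here is the channel mask between the two propagation steps: a standard attribution map would propagate the full relevance tensor, whereas zeroing all but one channel yields a separate heatmap per learned concept, which is the core idea the abstract describes.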
