Paper Title

Interpreting Interpretations: Organizing Attribution Methods by Criteria

Authors

Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson

Abstract

Motivated by distinct, though related, criteria, a growing number of attribution methods have been developed to interpret deep learning. While each relies on the interpretability of the concept of "importance" and our ability to visualize patterns, explanations produced by the methods often differ. As a result, input attributions for vision models fail to provide any level of human understanding of model behaviour. In this work we expand the foundations of human-understandable concepts with which attributions can be interpreted beyond "importance" and its visualization; we incorporate the logical concepts of necessity and sufficiency, and the concept of proportionality. We define metrics to represent these concepts as quantitative aspects of an attribution. This allows us to compare attributions produced by different methods and interpret them in novel ways: to what extent does this attribution (or this method) represent the necessity or sufficiency of the highlighted inputs, and to what extent is it proportional? We evaluate our measures on a collection of methods explaining convolutional neural networks (CNNs) for image classification. We conclude that some attribution methods are more appropriate for interpretation in terms of necessity while others are more appropriate in terms of sufficiency, and that no method is consistently the most appropriate in terms of both.
