Saliency Cards: A Framework to Characterize and Compare Saliency Methods

Paper Title

Saliency Cards: A Framework to Characterize and Compare Saliency Methods

Authors

Angie Boggust, Harini Suresh, Hendrik Strobelt, John V. Guttag, Arvind Satyanarayan

Abstract

Saliency methods are a common class of machine learning interpretability techniques that calculate how important each input feature is to a model's output. We find that, with the rapid pace of development, users struggle to stay informed of the strengths and limitations of new methods and, thus, choose methods for unprincipled reasons (e.g., popularity). Moreover, despite a corresponding rise in evaluation metrics, existing approaches assume universal desiderata for saliency methods (e.g., faithfulness) that do not account for diverse user needs. In response, we introduce saliency cards: structured documentation of how saliency methods operate and their performance across a battery of evaluative metrics. Through a review of 25 saliency method papers and 33 method evaluations, we identify 10 attributes that users should account for when choosing a method. We group these attributes into three categories that span the process of computing and interpreting saliency: methodology, or how the saliency is calculated; sensitivity, or the relationship between the saliency and the underlying model and data; and, perceptibility, or how an end user ultimately interprets the result. By collating this information, saliency cards allow users to more holistically assess and compare the implications of different methods. Through nine semi-structured interviews with users from various backgrounds, including researchers, radiologists, and computational biologists, we find that saliency cards provide a detailed vocabulary for discussing individual methods and allow for a more systematic selection of task-appropriate methods. Moreover, with saliency cards, we are able to analyze the research landscape in a more structured fashion to identify opportunities for new methods and evaluation metrics for unmet user needs.

Paper Title