Paper Title

Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods

Paper Authors

Nils Feldhus, Leonhard Hennig, Maximilian Dustin Nasert, Christopher Ebert, Robert Schwarzenberg, Sebastian Möller

Paper Abstract

Saliency maps can explain a neural model's predictions by identifying important input features. They are difficult to interpret for laypeople, especially for instances with many features. In order to make them more accessible, we formalize the underexplored task of translating saliency maps into natural language and compare methods that address two key challenges of this approach -- what and how to verbalize. In both automatic and human evaluation setups, using token-level attributions from text classification tasks, we compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations (heatmap visualizations and extractive rationales), measuring simulatability, faithfulness, helpfulness and ease of understanding. Instructing GPT-3.5 to generate saliency map verbalizations yields plausible explanations which include associations, abstractive summarization and commonsense reasoning, achieving by far the highest human ratings, but they are not faithfully capturing numeric information and are inconsistent in their interpretation of the task. In comparison, our search-based, model-free verbalization approach efficiently completes templated verbalizations, is faithful by design, but falls short in helpfulness and simulatability. Our results suggest that saliency map verbalization makes feature attribution explanations more comprehensible and less cognitively challenging to humans than conventional representations.
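To make the template-based (model-free) verbalization idea from the abstract concrete, below is a minimal sketch of turning token-level attribution scores into a templated natural-language explanation. The function name, template wording, and example scores are hypothetical illustrations of the general idea, not the paper's implementation:

```python
# Minimal sketch: convert a token-level saliency map into a templated
# verbalization. This is an illustrative assumption, not the authors' code.

from typing import List, Tuple

def verbalize_saliency(
    tokens_with_scores: List[Tuple[str, float]],
    predicted_label: str,
    top_k: int = 3,
) -> str:
    """Select the top-k attributed tokens and fill a fixed sentence template."""
    # Rank tokens by absolute attribution score (sign conventions differ
    # across attribution methods, hence abs()).
    ranked = sorted(tokens_with_scores, key=lambda ts: abs(ts[1]), reverse=True)
    top_tokens = [tok for tok, _ in ranked[:top_k]]
    token_list = ", ".join(f"'{t}'" for t in top_tokens)
    return (
        f"The tokens {token_list} contributed most "
        f"to the model's prediction of '{predicted_label}'."
    )

# Hypothetical attributions for a sentiment classification instance.
attributions = [("The", 0.01), ("movie", 0.12), ("was", 0.02),
                ("absolutely", 0.31), ("wonderful", 0.55)]
print(verbalize_saliency(attributions, predicted_label="positive", top_k=2))
# -> The tokens 'wonderful', 'absolutely' contributed most to the model's
#    prediction of 'positive'.
```

Because the template is filled deterministically from the attribution scores, such a verbalization is faithful by design, which is the trade-off the abstract draws against the more plausible but less faithful GPT-3.5 verbalizations.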
