Paper Title

Why do you think that? Exploring Faithful Sentence-Level Rationales Without Supervision

Paper Authors

Max Glockner, Ivan Habernal, Iryna Gurevych

Paper Abstract

Evaluating the trustworthiness of a model's prediction is essential for differentiating between `right for the right reasons' and `right for the wrong reasons'. Identifying textual spans that determine the target label, known as faithful rationales, usually relies on pipeline approaches or reinforcement learning. However, such methods either require supervision and thus costly annotation of the rationales or employ non-differentiable models. We propose a differentiable training-framework to create models which output faithful rationales on a sentence level, by solely applying supervision on the target task. To achieve this, our model solves the task based on each rationale individually and learns to assign high scores to those which solved the task best. Our evaluation on three different datasets shows competitive results compared to a standard BERT blackbox while exceeding a pipeline counterpart's performance in two cases. We further exploit the transparent decision-making process of these models to prefer selecting the correct rationales by applying direct supervision, thereby boosting the performance on the rationale-level.
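
The abstract describes a model that classifies the input from each candidate sentence individually and learns to score the sentences so that the best-performing one is preferred, trained end to end with only the target-task label. Below is a minimal PyTorch sketch of one way such a differentiable, score-weighted combination could look; the encoder choice (BERT [CLS] vectors), the hidden size, the softmax mixture, and the module names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of sentence-level rationale selection:
# each sentence is classified on its own, a scorer weights the sentences, and
# the prediction is the score-weighted mixture of per-sentence predictions, so
# task supervision alone pushes high scores toward sentences that solve the task.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceRationaleModel(nn.Module):
    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_labels)  # label prediction from one sentence
        self.scorer = nn.Linear(hidden_dim, 1)                # how useful each sentence is

    def forward(self, sentence_embeddings: torch.Tensor):
        # sentence_embeddings: (num_sentences, hidden_dim), e.g. BERT [CLS] vectors per sentence
        per_sentence_logits = self.classifier(sentence_embeddings)   # (S, num_labels)
        scores = self.scorer(sentence_embeddings).squeeze(-1)        # (S,)
        weights = F.softmax(scores, dim=0)                           # rationale distribution over sentences
        # Differentiable mixture of per-sentence label distributions.
        mixed_probs = (weights.unsqueeze(-1) *
                       F.softmax(per_sentence_logits, dim=-1)).sum(dim=0)
        return mixed_probs, weights

# Usage: train with only the target-task loss; at inference, the highest-weighted
# sentence is read off as the selected (faithful) rationale.
model = SentenceRationaleModel(hidden_dim=768, num_labels=3)
sent_vecs = torch.randn(5, 768)                   # 5 candidate sentences (dummy encodings)
probs, weights = model(sent_vecs)
loss = F.nll_loss(torch.log(probs).unsqueeze(0), torch.tensor([1]))
loss.backward()
```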
