Paper Title

The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems

Paper Authors

Caleb Ziems, Jane A. Yu, Yi-Chia Wang, Alon Halevy, Diyi Yang

Paper Abstract

Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that apply to a situation simultaneously. In this work, we introduce a new resource, not to authoritatively resolve moral ambiguities, but instead to facilitate systematic understanding of the intuitions, values, and moral judgments reflected in the utterances of dialogue systems. The Moral Integrity Corpus, MIC, is such a resource, which captures the moral assumptions of 38k prompt-reply pairs, using 99k distinct Rules of Thumb (RoTs). Each RoT reflects a particular moral conviction that can explain why a chatbot's reply may appear acceptable or problematic. We further organize RoTs with a set of 9 moral and social attributes and benchmark performance for attribute classification. Most importantly, we show that current neural language models can automatically generate new RoTs that reasonably describe previously unseen interactions, but they still struggle with certain scenarios. Our findings suggest that MIC will be a useful resource for understanding language models' implicit moral assumptions and flexibly benchmarking the integrity of conversational agents. To download the data, see https://github.com/GT-SALT/mic
