Paper Title
ProsocialDialog: A Prosocial Backbone for Conversational Agents
Paper Authors
Paper Abstract
Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms. Covering diverse unethical, problematic, biased, and toxic situations, ProsocialDialog contains responses that encourage prosocial behavior, grounded in commonsense social rules (i.e., rules-of-thumb, RoTs). Created via a human-AI collaborative framework, ProsocialDialog consists of 58K dialogues, with 331K utterances, 160K unique RoTs, and 497K dialogue safety labels accompanied by free-form rationales. With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating RoTs given conversational context, and a socially-informed dialogue agent, Prost. Empirical results show that Prost generates more socially acceptable dialogues compared to other state-of-the-art language and dialogue models in both in-domain and out-of-domain settings. Additionally, Canary effectively guides conversational agents and off-the-shelf language models to generate significantly more prosocial responses. Our work highlights the promise and importance of creating and steering conversational AI to be socially responsible.