Paper Title

Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

Authors

Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao, Jiancheng Lv, Zhenglu Yang, Tat-Seng Chua

Abstract

A proactive dialogue system can lead a conversation to a goal topic and has promising potential in bargaining, persuasion, and negotiation. The current corpus-based learning approach limits its practical application in real-world scenarios. To this end, we advance the study of proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to non-cooperative user behavior: the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining high user satisfaction do not always converge, because the topics close to the goal and the topics the user prefers may not be the same. To address this issue, we propose a new solution named I-Pro that can learn a Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate that I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.
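
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that a goal weight is learned from four factors, so the sigmoid-over-linear form, the function names `goal_weight` and `pick_topic`, the parameter tuple, and the per-topic scores in [0, 1] are all assumptions made for illustration.

```python
import math


def goal_weight(turn, difficulty, satisfaction, cooperation, params):
    """Hypothetical goal weight: squash a linear mix of the four factors into (0, 1).

    The functional form is an assumption; the abstract only names the factors.
    """
    w_turn, w_diff, w_sat, w_coop, bias = params
    z = (w_turn * turn + w_diff * difficulty
         + w_sat * satisfaction + w_coop * cooperation + bias)
    return 1.0 / (1.0 + math.exp(-z))


def pick_topic(candidates, weight):
    """Pick the next topic by blending goal closeness and user preference.

    candidates maps topic -> (goal_closeness, user_preference), both assumed in [0, 1].
    A weight near 1 favors reaching the goal; a weight near 0 favors the user's taste.
    """
    def score(topic):
        closeness, preference = candidates[topic]
        return weight * closeness + (1.0 - weight) * preference

    return max(candidates, key=score)


# Illustrative, made-up candidate topics.
candidates = {"movies": (0.9, 0.2), "music": (0.3, 0.8)}
goal_driven = pick_topic(candidates, 0.9)   # weight favors the goal
user_driven = pick_topic(candidates, 0.1)   # weight favors user preference
```

With these made-up numbers, a high goal weight selects the topic nearest the goal, while a low goal weight selects the topic the user prefers, which is the tension the abstract says the two targets can create.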
