Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

Paper Title

Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

Authors

Wenqiang Lei, Yao Zhang, Feifan Song, Hongru Liang, Jiaxin Mao, Jiancheng Lv, Zhenglu Yang, Tat-Seng Chua

Abstract

A proactive dialogue system can lead a conversation toward a goal topic and has promising applications in bargaining, persuasion, and negotiation. The current corpus-based learning approach limits its practical use in real-world scenarios. To this end, we advance the study of proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to non-cooperative user behavior: the user talks about off-path topics when he or she is not satisfied with the previous topics introduced by the agent. We argue that the objectives of reaching the goal topic quickly and maintaining high user satisfaction do not always converge, because the topics close to the goal and the topics the user prefers may not be the same. To address this issue, we propose a new solution named I-Pro that can learn a Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate that I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.
