Paper Title

Quick Starting Dialog Systems with Paraphrase Generation

Authors

Louis Marceau, Raouf Belbahar, Marc Queudot, Nada Naji, Eric Charton, Marie-Jean Meurs

Abstract

Acquiring training data to improve the robustness of dialog systems can be a painstakingly long process. In this work, we propose a method to reduce the cost and effort of creating new conversational agents by using paraphrase generation to artificially generate more data from existing examples. Our proposed approach can kick-start a dialog system with little human effort and bring its performance to a level satisfactory enough to allow actual interactions with real end-users. We experimented with two neural paraphrasing approaches, namely Neural Machine Translation and a Transformer-based seq2seq model. We present the results obtained with two datasets in English and in French: a crowd-sourced public intent classification dataset and our own corporate dialog system dataset. We show that our proposed approach increased the generalization capabilities of the intent classification model on both datasets, reducing the effort required to initialize a new dialog system and helping to deploy this technology at scale within an organization.
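The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a paraphraser is any callable that maps one utterance to candidate paraphrases, and that the translation-based variant works as a round trip through a pivot language, which the abstract does not specify. The names `augment_intents`, `round_trip_paraphraser`, `to_pivot`, and `from_pivot` are hypothetical, and a real system would plug in an actual translation or seq2seq model.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical type: any function mapping one utterance to candidate paraphrases.
Paraphraser = Callable[[str], Iterable[str]]


def round_trip_paraphraser(
    to_pivot: Callable[[str], str],
    from_pivot: Callable[[str], str],
) -> Paraphraser:
    """Build a paraphraser from two translation functions (assumed round trip).

    to_pivot and from_pivot are placeholders for real translation models.
    """
    def paraphrase(utterance: str) -> List[str]:
        return [from_pivot(to_pivot(utterance))]

    return paraphrase


def augment_intents(
    examples: Dict[str, List[str]],
    paraphraser: Paraphraser,
) -> Dict[str, List[str]]:
    """Return a copy of the intent -> utterances map with new paraphrases added.

    Candidates that are empty or that match an existing utterance
    (ignoring case and surrounding whitespace) are skipped.
    """
    augmented: Dict[str, List[str]] = {}
    for intent, utterances in examples.items():
        seen = {u.strip().lower() for u in utterances}
        result = list(utterances)
        for utterance in utterances:
            for candidate in paraphraser(utterance):
                key = candidate.strip().lower()
                if key and key not in seen:
                    seen.add(key)
                    result.append(candidate.strip())
        augmented[intent] = result
    return augmented
```

The augmented map can then be used to train any intent classifier. Deduplication matters here because round-trip translation often returns the input unchanged, which would add no new training signal.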
