Paper Title

Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System

Authors

Eda Okur, Saurav Sahay, Lama Nachman

Abstract

Contextually aware intelligent agents are often required to understand the users and their surroundings in real-time. Our goal is to build Artificial Intelligence (AI) systems that can assist children in their learning process. Within such complex frameworks, Spoken Dialogue Systems (SDS) are crucial building blocks to handle efficient task-oriented communication with children in game-based learning settings. We are working towards a multimodal dialogue system for younger kids learning basic math concepts. Our focus is on improving the Natural Language Understanding (NLU) module of the task-oriented SDS pipeline with limited datasets. This work explores the potential benefits of data augmentation with paraphrase generation for the NLU models trained on small task-specific datasets. We also investigate the effects of extracting entities for conceivably further data expansion. We have shown that paraphrasing with model-in-the-loop (MITL) strategies using small seed data is a promising approach yielding improved performance results for the Intent Recognition task.
