Paper Title

Are Neural Open-Domain Dialog Systems Robust to Speech Recognition Errors in the Dialog History? An Empirical Study

Paper Authors

Karthik Gopalakrishnan, Behnam Hedayatnia, Longshaokan Wang, Yang Liu, Dilek Hakkani-Tur

Paper Abstract

Large end-to-end neural open-domain chatbots are becoming increasingly popular. However, research on building such chatbots has typically assumed that the user input is written in nature and it is not clear whether these chatbots would seamlessly integrate with automatic speech recognition (ASR) models to serve the speech modality. We aim to bring attention to this important question by empirically studying the effects of various types of synthetic and actual ASR hypotheses in the dialog history on TransferTransfo, a state-of-the-art Generative Pre-trained Transformer (GPT) based neural open-domain dialog system from the NeurIPS ConvAI2 challenge. We observe that TransferTransfo trained on written data is very sensitive to such hypotheses introduced to the dialog history during inference time. As a baseline mitigation strategy, we introduce synthetic ASR hypotheses to the dialog history during training and observe marginal improvements, demonstrating the need for further research into techniques to make end-to-end open-domain chatbots fully speech-robust. To the best of our knowledge, this is the first study to evaluate the effects of synthetic and actual ASR hypotheses on a state-of-the-art neural open-domain dialog system and we hope it promotes speech-robustness as an evaluation criterion in open-domain dialog.
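
The abstract's baseline mitigation strategy is to inject synthetic ASR hypotheses into the dialog history during training. The paper's actual simulation method is not described in the abstract, so the following is only a minimal illustrative sketch: it corrupts written dialog-history turns with uniform word-level deletions, substitutions, and insertions at a target error rate. The function name `simulate_asr_errors`, the uniform error model, and the filler vocabulary are assumptions for illustration; a realistic simulator would typically use acoustically or phonetically confusable substitutions.

```python
import random

def simulate_asr_errors(turn, word_error_rate=0.2, vocab=None, seed=None):
    """Corrupt a written dialog turn with word-level deletions,
    substitutions, and insertions to roughly mimic ASR hypotheses.

    Illustrative sketch only: a uniform random error model, not the
    paper's actual synthetic ASR simulation.
    """
    rng = random.Random(seed)
    vocab = vocab or ["the", "a", "uh", "you", "know", "like", "right"]
    corrupted = []
    for word in turn.split():
        r = rng.random()
        if r < word_error_rate / 3:
            continue                                 # deletion: drop the word
        elif r < 2 * word_error_rate / 3:
            corrupted.append(rng.choice(vocab))      # substitution: replace the word
        elif r < word_error_rate:
            corrupted.append(word)
            corrupted.append(rng.choice(vocab))      # insertion: add a spurious word
        else:
            corrupted.append(word)                   # keep the word unchanged
    return " ".join(corrupted)


# Example: corrupt only the dialog-history turns (not the response target),
# mirroring the training-time augmentation described in the abstract.
history = ["do you have any pets", "yes i have two dogs at home"]
noisy_history = [simulate_asr_errors(t, word_error_rate=0.3, seed=0) for t in history]
print(noisy_history)
```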
