Learning to Rank Intents in Voice Assistants

Paper Title

Learning to Rank Intents in Voice Assistants

Authors

Anantha, Raviteja, Chappidi, Srinivas, Dawoodi, William

Abstract

Voice assistants aim to fulfill user requests by choosing the best intent from multiple options generated by their Automated Speech Recognition and Natural Language Understanding sub-systems. However, voice assistants do not always produce the expected results. This can happen because voice assistants choose from ambiguous intents; user-specific or domain-specific contextual information reduces the ambiguity of the user request. Additionally, the user information-state can be leveraged to understand how relevant/executable a specific intent is for a user request. In this work, we propose a novel Energy-based model for the intent ranking task, where we learn an affinity metric and model the trade-off between the meaning extracted from speech utterances and the relevance/executability aspects of the intent. Furthermore, we present a Multisource Denoising Autoencoder-based pretraining that is capable of learning fused representations of data from multiple sources. We empirically show that our approach outperforms existing state-of-the-art methods by reducing the error rate by 3.8%, which in turn reduces ambiguity and eliminates undesired dead-ends, leading to a better user experience. Finally, we evaluate the robustness of our algorithm on the intent ranking task and show that our algorithm improves the robustness by 33.3%.
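
The abstract does not give the energy function itself, so the following is only a minimal sketch of the general idea of energy-based intent ranking, not the authors' model. It assumes a hypothetical energy that trades off a cosine affinity between an utterance embedding and an intent embedding against a relevance/executability score, weighted by an illustrative parameter `alpha`. All names (`energy`, `rank_intents`, `alpha`, the candidate fields) and the toy vectors are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def energy(utterance_vec, intent_vec, relevance, alpha=0.5):
    """Hypothetical energy: lower means a better match.

    alpha weights semantic affinity; (1 - alpha) weights the
    relevance/executability score (assumed to lie in [0, 1]).
    """
    affinity = cosine(utterance_vec, intent_vec)
    return -(alpha * affinity + (1.0 - alpha) * relevance)


def rank_intents(utterance_vec, candidates, alpha=0.5):
    """Return the candidate intents sorted by ascending energy (best first)."""
    return sorted(
        candidates,
        key=lambda c: energy(utterance_vec, c["vec"], c["relevance"], alpha),
    )


# Toy data (made up): one intent matches the utterance semantically,
# the other is more relevant/executable for this user's state.
utterance = [1.0, 0.0]
candidates = [
    {"name": "play_music", "vec": [1.0, 0.0], "relevance": 0.1},
    {"name": "call_contact", "vec": [0.6, 0.8], "relevance": 0.9},
]

for alpha in (0.5, 0.9):
    ranked = rank_intents(utterance, candidates, alpha=alpha)
    print(alpha, [c["name"] for c in ranked])
```

In this toy setup, changing `alpha` changes which intent ranks first, which illustrates the kind of trade-off between extracted meaning and relevance/executability that the abstract describes. In a learned model the affinity metric and the weighting would be trained rather than fixed by hand.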
