Paper Title
Capture Salient Historical Information: A Fast and Accurate Non-Autoregressive Model for Multi-turn Spoken Language Understanding
Paper Authors
Paper Abstract
Spoken Language Understanding (SLU), a core component of task-oriented dialogue systems, calls for fast inference given the impatience of human users. Existing work increases inference speed by designing non-autoregressive models for single-turn SLU tasks, but fails to apply to multi-turn SLU when confronting dialogue history. An intuitive idea is to concatenate all historical utterances and apply the non-autoregressive models directly. However, this approach seriously misses salient historical information and suffers from the uncoordinated-slot problem. To overcome these shortcomings, we propose a novel model for multi-turn SLU named Salient History Attention with Layer-Refined Transformer (SHA-LRT), which consists of an SHA module, a Layer-Refined Mechanism (LRM), and a Slot Label Generation (SLG) task. SHA captures salient historical information for the current dialogue from both historical utterances and results via a well-designed history-attention mechanism. LRM predicts preliminary SLU results from the Transformer's middle states and uses them to guide the final prediction, while SLG supplies sequential dependency information to the non-autoregressive encoder. Experiments on public datasets show that our model significantly improves multi-turn SLU performance (by 17.5% on the overall metric) while accelerating inference by nearly 15 times over the state-of-the-art baseline, and it is also effective on single-turn SLU tasks.
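The core idea behind the Layer-Refined Mechanism described in the abstract — producing preliminary slot predictions from intermediate layers and feeding them back to guide later layers, while all token labels are emitted in parallel — can be illustrated with a toy sketch. This is a minimal conceptual illustration, not the authors' implementation: the layer transforms, dimensions, and feedback scheme below are stand-in assumptions, and real Transformer layers are replaced by simple matrix operations.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_refined_encoder(tokens, n_layers=3, d=8, n_slots=5):
    """Toy non-autoregressive encoder: after each layer, a preliminary
    per-token slot distribution is computed (the LRM idea) and its
    embedding is added back into the hidden states to guide the next
    layer; the last layer's prediction is the final output."""
    T = len(tokens)
    h = rng.normal(size=(T, d))                 # stand-in token embeddings
    W_layer = 0.1 * rng.normal(size=(n_layers, d, d))
    W_slot = 0.1 * rng.normal(size=(d, n_slots))
    slot_emb = 0.1 * rng.normal(size=(n_slots, d))
    for layer in range(n_layers):
        h = np.tanh(h @ W_layer[layer])         # stand-in for a Transformer layer
        prelim = softmax(h @ W_slot)            # preliminary slot prediction (LRM)
        h = h + prelim @ slot_emb               # feed prediction back as guidance
    return prelim.argmax(axis=-1)               # all T labels decoded in parallel

labels = layer_refined_encoder(["play", "jazz", "music"])
print(labels.shape)
```

Because every token's label is produced in one forward pass rather than one step at a time, inference cost does not grow with an autoregressive decoding loop — which is the speed advantage the abstract claims for non-autoregressive SLU.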