Paper Title

Warped Language Models for Noise Robust Language Understanding

Paper Authors

Mahdi Namazifar, Gokhan Tur, Dilek Hakkani-Tür

Abstract

Masked Language Models (MLMs) are self-supervised neural networks trained to fill in the blanks in a given sentence, where the blanks are marked by mask tokens. Despite the tremendous success of MLMs on various text-based tasks, they are not robust for spoken language understanding, especially against noise from spontaneous conversational speech recognition. In this work we introduce Warped Language Models (WLMs), in which input sentences at training time go through the same modifications as in MLMs, plus two additional modifications, namely inserting and dropping random tokens. These two modifications extend and contract the sentence beyond the modifications in MLMs, hence the word "warped" in the name. The insertion and drop modifications applied to the input text during WLM training resemble the types of noise caused by Automatic Speech Recognition (ASR) errors, and as a result WLMs are likely to be more robust to ASR noise. Through computational results we show that natural language understanding systems built on top of WLMs perform better than those built on top of MLMs, especially in the presence of ASR errors.
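
As a concrete illustration of the warping described in the abstract, the sketch below applies the three training-time modifications, masking, random insertion, and random dropping, to a token sequence. The `warp` function, the probability values, and the toy vocabulary are hypothetical choices for illustration and are not taken from the authors' implementation.

```python
import random

# Illustrative sketch of WLM-style input warping; probabilities and the
# MASK_TOKEN string are assumed values, not the paper's exact settings.
MASK_TOKEN = "[MASK]"

def warp(tokens, vocab, p_mask=0.1, p_insert=0.025, p_drop=0.025, seed=None):
    """Mask tokens as in an MLM, plus the two WLM modifications:
    insert random tokens (extends the sentence) and drop tokens
    (contracts it), mimicking ASR-like noise."""
    rng = random.Random(seed)
    warped = []
    for tok in tokens:
        # Insertion: occasionally add a random vocabulary token.
        if rng.random() < p_insert:
            warped.append(rng.choice(vocab))
        r = rng.random()
        if r < p_drop:
            continue                       # drop: token is removed
        if r < p_drop + p_mask:
            warped.append(MASK_TOKEN)      # mask, as in a standard MLM
        else:
            warped.append(tok)             # keep the token unchanged
    return warped

# Example: warping a short utterance with a toy vocabulary.
print(warp("i would like to book a table".split(),
           vocab=["the", "a", "uh", "please"], seed=0))
```

Because insertion and drop change the sequence length, the warped sentence no longer aligns token-for-token with the original, which is what distinguishes this scheme from plain MLM masking.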
