Paper Title
$N$-gram Is Back: Residual Learning of Neural Text Generation with $n$-gram Language Model
Paper Authors
Paper Abstract
$N$-gram language models (LM) have been largely superseded by neural LMs as the latter exhibit better performance. However, we find that $n$-gram models can achieve satisfactory performance on a large proportion of testing cases, indicating they have already captured abundant knowledge of the language with relatively low computational cost. With this observation, we propose to learn a neural LM that fits the residual between an $n$-gram LM and the real-data distribution. The combination of $n$-gram and neural LMs not only allows the neural part to focus on the deeper understanding of language but also provides a flexible way to customize an LM by switching the underlying $n$-gram model without changing the neural model. Experimental results on three typical language tasks (i.e., language modeling, machine translation, and summarization) demonstrate that our approach consistently attains additional performance gains over popular standalone neural models. We also show that our approach allows for effective domain adaptation by simply switching to a domain-specific $n$-gram model, without any extra training. Our code is released at https://github.com/ghrua/NgramRes.
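To make the residual idea concrete, the sketch below shows one plausible way to combine a fixed $n$-gram LM with a neural residual: the neural model's logits are added to the $n$-gram log-probabilities before the softmax, so the neural part only needs to model what the $n$-gram LM gets wrong. This is a minimal illustration under that assumption; the function `residual_next_token_dist`, its arguments, and the exact additive log-space form are hypothetical and not necessarily the paper's formulation.

```python
import torch
import torch.nn.functional as F


def residual_next_token_dist(neural_logits: torch.Tensor,
                             ngram_probs: torch.Tensor,
                             eps: float = 1e-8) -> torch.Tensor:
    """Illustrative residual combination of an n-gram LM and a neural LM.

    neural_logits: (batch, vocab) raw scores from the neural model,
                   interpreted here as a residual correction.
    ngram_probs:   (batch, vocab) next-token probabilities from a
                   pre-built n-gram LM (assumed to be given).
    """
    # Work in log space: score(x) = log p_ngram(x) + f_neural(x),
    # then normalize, so the neural term only corrects the n-gram prior.
    combined = torch.log(ngram_probs + eps) + neural_logits
    return F.softmax(combined, dim=-1)
```

Under this formulation, swapping in a different (e.g., domain-specific) $n$-gram model changes only `ngram_probs`, leaving the neural parameters untouched, which mirrors the training-free domain adaptation described in the abstract.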