Paper Title
Rethinking Document-level Neural Machine Translation
Paper Authors
Paper Abstract
This paper does not aim to introduce a novel model for document-level neural machine translation. Instead, we return to the original Transformer model and seek to answer the following question: Is the capacity of current models strong enough for document-level translation? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even with input lengths of 2,000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that document-level Transformer models outperform sentence-level ones and many previous methods on a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation.
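The abstract's central technical point is that a vanilla Transformer can be trained directly on document-length inputs of up to roughly 2,000 words. Below is a minimal sketch of one way such training examples could be prepared, by greedily concatenating consecutive parallel sentences up to a word budget. The function name and the [SEP] boundary marker are illustrative assumptions, not the authors' exact preprocessing.

```python
from typing import Iterable, List, Tuple

SEP = "[SEP]"  # hypothetical sentence-boundary marker, not from the paper


def build_document_examples(
    sent_pairs: Iterable[Tuple[str, str]],
    max_words: int = 2000,
) -> List[Tuple[str, str]]:
    """Greedily pack consecutive parallel sentences into document-level
    (source, target) examples whose source side stays within `max_words`."""
    examples: List[Tuple[str, str]] = []
    src_buf: List[str] = []
    tgt_buf: List[str] = []
    n_words = 0
    for src, tgt in sent_pairs:
        n = len(src.split())
        # Close the current example when adding this sentence would
        # exceed the word budget; sentences are never split.
        if src_buf and n_words + n > max_words:
            examples.append((f" {SEP} ".join(src_buf), f" {SEP} ".join(tgt_buf)))
            src_buf, tgt_buf, n_words = [], [], 0
        src_buf.append(src)
        tgt_buf.append(tgt)
        n_words += n
    if src_buf:  # flush the final partial document
        examples.append((f" {SEP} ".join(src_buf), f" {SEP} ".join(tgt_buf)))
    return examples
```

Packing greedily without splitting sentences keeps each example a contiguous, coherent document span, which is what allows a standard sentence-level Transformer architecture to be reused unchanged on document-level data.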