Paper Title

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Paper Authors

Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich

Paper Abstract

Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods.
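
To make the random online backtranslation idea concrete, below is a minimal sketch of how such training-time augmentation could look. It is an illustration based only on the abstract's description, not the authors' released implementation; the `translate` callable, the `robt_augment` helper, and the `<2xx>` target-language token convention are assumptions (the token style is a common multilingual NMT practice).

```python
import random
from typing import Callable, List, Tuple

# Hypothetical interface: translate(sentence, target_lang) -> translated sentence.
Translator = Callable[[str, str], str]

def robt_augment(batch: List[Tuple[str, str, str]],
                 languages: List[str],
                 translate: Translator) -> List[Tuple[str, str]]:
    """Build synthetic pairs for randomly sampled zero-shot directions.

    Each batch item is (source, target, target_lang). For every target
    sentence, sample a random intermediate language, back-translate the
    target into it with the current model (hence "online"), and emit a
    synthetic pair that trains the model to translate that output back
    into the original target language.
    """
    synthetic = []
    for _, tgt, tgt_lang in batch:
        pivot = random.choice([l for l in languages if l != tgt_lang])
        back = translate(tgt, pivot)  # online back-translation with the current model
        # Prepend a target-language token (a common multilingual NMT convention)
        # so the model sees training signal for the unseen direction pivot -> tgt_lang.
        synthetic.append((f"<2{tgt_lang}> {back}", tgt))
    return synthetic

# Toy usage with a stub translator; a real system would call the NMT model itself.
if __name__ == "__main__":
    stub = lambda sent, lang: f"[{lang}] {sent}"
    batch = [("Hello world.", "Bonjour le monde.", "fr")]
    print(robt_augment(batch, ["de", "fr", "es"], stub))
```

In this reading, the synthetic pairs are mixed into the regular training batches, so the model continually sees (noisy) examples of language pairs that have no parallel data, which is what counteracts the off-target translation problem.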
