Paper Title
Exploring Benefits of Transfer Learning in Neural Machine Translation
Paper Authors
Paper Abstract
Neural machine translation is known to require large numbers of parallel training sentences, which generally prevents it from excelling on low-resource language pairs. This thesis explores cross-lingual transfer learning on neural networks as a way of addressing this lack of resources. We propose several transfer learning approaches that reuse a model pretrained on a high-resource language pair, paying particular attention to the simplicity of the techniques. We study two scenarios: (a) when we reuse the high-resource model without any prior modification of its training process, and (b) when we can prepare the first-stage high-resource model for transfer learning in advance. For the former scenario, we present a proof-of-concept method that reuses a model trained by other researchers. For the latter, we present a method that reaches even larger improvements in translation performance. Beyond the proposed techniques, we focus on an in-depth analysis of transfer learning and try to shed some light on where its improvements come from. We show how our techniques address specific problems of low-resource languages and remain suitable even for high-resource transfer learning. We evaluate potential drawbacks and behavior by studying transfer learning in various situations, for example, under artificially damaged training corpora or with various model parts fixed.
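The core transfer idea described in the abstract, training a parent model on a high-resource pair and then continuing training on the low-resource child pair, can be sketched in a few lines. The sketch below is a hypothetical PyTorch illustration, not code from the thesis: the TinyNMT class, file name, and language pairs are assumptions for illustration, and a subword vocabulary shared between parent and child is presupposed so that the parent weights can be loaded unchanged.

import torch
import torch.nn as nn

class TinyNMT(nn.Module):
    """A stand-in encoder-decoder; any seq2seq NMT architecture is handled the same way."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        # A single embedding/output vocabulary shared by both language pairs,
        # so the parent's subword representations stay meaningful for the child.
        self.embed = nn.Embedding(vocab_size, d_model)
        self.body = nn.Transformer(d_model=d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt):
        h = self.body(self.embed(src), self.embed(tgt))
        return self.out(h)

# Stage 1: train the parent model on the high-resource pair and save it.
parent = TinyNMT(vocab_size=32000)
# ... train on the high-resource parallel corpus (e.g. Czech-English) ...
torch.save(parent.state_dict(), "parent.pt")

# Stage 2: the child model has the identical architecture and vocabulary,
# so the parent checkpoint loads directly; training then simply continues
# on the low-resource pair instead of starting from random initialization.
child = TinyNMT(vocab_size=32000)
child.load_state_dict(torch.load("parent.pt"))
# ... continue training on the low-resource parallel corpus ...

Note how this matches scenario (a): nothing about the parent's training needs to change, since the child run merely starts from the parent checkpoint rather than from random weights.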