Paper Title
Token Drop mechanism for Neural Machine Translation
Paper Authors
Paper Abstract
Neural machine translation models with millions of parameters are vulnerable to unfamiliar inputs. We propose Token Drop to improve generalization and avoid overfitting in NMT models. It is similar to word dropout, but instead of setting word embeddings to zero, we replace dropped tokens with a special token. We further introduce two self-supervised objectives: Replaced Token Detection and Dropped Token Prediction. Our method forces the model to generate the target translation with less information, so that it learns better textual representations. Experiments on Chinese-English and English-Romanian benchmarks demonstrate the effectiveness of our approach, and our model achieves significant improvements over a strong Transformer baseline.
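The token replacement step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the drop rate, the `<drop>` placeholder symbol, and the returned per-position labels (which a Dropped Token Prediction head could train against) are all assumptions for the sake of the example.

```python
import random

def token_drop(tokens, drop_rate=0.15, drop_token="<drop>", rng=None):
    """Randomly replace input tokens with a special placeholder token.

    Unlike word dropout (which zeroes word embeddings), each selected
    position keeps a real symbol (`drop_token`) in the sequence.
    Returns the corrupted sequence and per-position labels: the original
    token where a drop occurred, None elsewhere. (Hypothetical sketch;
    hyperparameters are not taken from the paper.)
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < drop_rate:
            corrupted.append(drop_token)  # replaced, not zeroed
            labels.append(tok)            # target for recovering the token
        else:
            corrupted.append(tok)
            labels.append(None)
    return corrupted, labels

# Example: corrupt a source sentence with a fixed seed for reproducibility.
sentence = "we propose token drop to improve generalization".split()
corrupted, labels = token_drop(sentence, drop_rate=0.3, rng=random.Random(0))
```

A Replaced Token Detection objective would then classify each position of `corrupted` as original vs. replaced, while Dropped Token Prediction would recover the non-`None` entries of `labels`.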