Paper Title

TSMind: Alibaba and Soochow University's Submission to the WMT22 Translation Suggestion Task

Paper Authors

Xin Ge, Ke Wang, Jiayi Wang, Nini Xiao, Xiangyu Duan, Yu Zhao, Yuqi Zhang

Paper Abstract

This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion (TS). We participate in the English-German and English-Chinese tasks. We adopt the paradigm of fine-tuning large-scale pre-trained models on downstream tasks, which has recently achieved great success. We choose FAIR's WMT19 English-German news translation system for English-German and MBART50 for English-Chinese as our pre-trained models. Given the task's restriction on the use of training data, we follow the data augmentation strategies proposed by WeTS to boost the performance of our TS model. The difference is that we further use a dual conditional cross-entropy model and a GPT-2 language model to filter the augmented data. The final leaderboard shows that our submissions rank first in three of the four language directions of the Naive TS track of the WMT22 Translation Suggestion task.
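To make the filtering step concrete, below is a minimal Python sketch of the two scores the abstract mentions: a dual conditional cross-entropy score (Junczys-Dowmunt, 2018) computed from forward and backward translation models, and a GPT-2 perplexity check. The checkpoints (`Helsinki-NLP/opus-mt-en-de`, `gpt2`), thresholds, and helper names are illustrative assumptions for this sketch, not the paper's actual systems or settings.

```python
# A minimal sketch of the augmented-data filters, assuming HuggingFace
# checkpoints stand in for the systems actually used in the paper.
import torch
from transformers import (
    AutoModelForSeq2SeqLM, AutoTokenizer,
    GPT2LMHeadModel, GPT2TokenizerFast,
)

def avg_xent(model, tok, src: str, tgt: str) -> float:
    """Length-normalized cross-entropy H(tgt | src) under a seq2seq MT model."""
    enc = tok(src, return_tensors="pt")
    labels = tok(text_target=tgt, return_tensors="pt").input_ids
    with torch.no_grad():
        # HuggingFace returns the mean per-token cross-entropy as .loss
        return model(**enc, labels=labels).loss.item()

def dual_xent_score(fwd, fwd_tok, bwd, bwd_tok, src: str, tgt: str) -> float:
    """Dual conditional cross-entropy: penalizes both high entropy and
    disagreement between the two translation directions. Lower is cleaner."""
    h_fwd = avg_xent(fwd, fwd_tok, src, tgt)  # H(tgt | src)
    h_bwd = avg_xent(bwd, bwd_tok, tgt, src)  # H(src | tgt)
    return abs(h_fwd - h_bwd) + 0.5 * (h_fwd + h_bwd)

def gpt2_perplexity(lm, lm_tok, text: str) -> float:
    """Perplexity of a sentence under GPT-2 (English-side fluency check)."""
    ids = lm_tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return torch.exp(lm(ids, labels=ids).loss).item()

# Hypothetical usage: keep an augmented pair only if both filters pass.
fwd = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de").eval()
fwd_tok = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
bwd = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-de-en").eval()
bwd_tok = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-de-en")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
lm_tok = GPT2TokenizerFast.from_pretrained("gpt2")

src, tgt = "The weather is nice today.", "Das Wetter ist heute schön."
# Thresholds are placeholders; in practice they would be tuned on held-out data.
if (dual_xent_score(fwd, fwd_tok, bwd, bwd_tok, src, tgt) < 2.0
        and gpt2_perplexity(lm, lm_tok, src) < 200.0):
    print("keep pair")
```

The dual score combines the average of the two directional cross-entropies (absolute quality) with their difference (agreement), so pairs that only one direction considers plausible are penalized; the language-model perplexity additionally screens out disfluent augmented sentences.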
