Paper Title
SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models
Paper Authors
Paper Abstract
This paper describes the Stevens Institute of Technology's submission to the WMT 2022 Shared Task: Code-mixed Machine Translation (MixMT). The task consisted of two subtasks: subtask $1$, Hindi/English to Hinglish translation, and subtask $2$, Hinglish to English translation. Our improvements come from the use of large pre-trained multilingual NMT models and in-domain datasets, combined with back-translation and ensembling techniques. The translation output is automatically evaluated against the reference translations using ROUGE-L and WER. Our system achieves $1^{st}$ position on subtask $2$ according to ROUGE-L, WER, and human evaluation, $1^{st}$ position on subtask $1$ according to WER and human evaluation, and $3^{rd}$ position on subtask $1$ with respect to the ROUGE-L metric.
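As a rough illustration of the automatic evaluation the abstract mentions, the sketch below scores system outputs against reference translations with ROUGE-L and WER. It uses the open-source `rouge_score` and `jiwer` packages as stand-ins; these packages and the sample sentences are illustrative assumptions, not the shared task's official scoring tooling.

```python
# Minimal sketch of ROUGE-L / WER evaluation, assuming the open-source
# `rouge_score` and `jiwer` packages (pip install rouge-score jiwer).
from rouge_score import rouge_scorer
import jiwer

references = ["I am going to the market tomorrow"]  # hypothetical references
hypotheses = ["I will go to the market tomorrow"]   # hypothetical system outputs

# ROUGE-L: F1 over the longest common subsequence of words (higher is better).
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = sum(
    scorer.score(ref, hyp)["rougeL"].fmeasure
    for ref, hyp in zip(references, hypotheses)
) / len(references)

# WER: word-level edit distance between hypothesis and reference,
# normalized by reference length (lower is better).
wer = jiwer.wer(references, hypotheses)

print(f"ROUGE-L F1: {rouge_l:.4f}  WER: {wer:.4f}")
```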