Paper Title

Benchmarking Azerbaijani Neural Machine Translation

Authors

Chen, Chih-Chen; Chen, William

Abstract

Little research has been done on Neural Machine Translation (NMT) for Azerbaijani. In this paper, we benchmark the performance of Azerbaijani-English NMT systems across a range of techniques and datasets. We evaluate which segmentation techniques work best for Azerbaijani translation and benchmark the performance of Azerbaijani NMT models across several domains of text. Our results show that while Unigram segmentation improves NMT performance and Azerbaijani translation models scale better with dataset quality than quantity, cross-domain generalization remains a challenge.
