Paper Title
FFR V1.0: Fon-French Neural Machine Translation
Authors
Abstract
Africa has the highest linguistic diversity in the world. Given the importance of language to communication, and of reliable, powerful, and accurate machine translation models to modern intercultural communication, there have been (and still are) efforts to create state-of-the-art translation models for the many African languages. However, the low-resource status and the diacritical and tonal complexities of African languages are major issues facing African NLP today. FFR is a major step towards creating a robust translation model from Fon, a very low-resource and tonal language, to French, for research and public use. In this paper, we describe our pilot project: the creation of a large, growing corpus of Fon-to-French translations and our FFR v1.0 model, trained on this dataset. The dataset and model are made publicly available.