Paper title
Cedille: A large autoregressive French language model
Paper authors
Paper abstract
Scaling up the size and training of autoregressive language models has enabled novel ways of solving Natural Language Processing tasks using zero-shot and few-shot learning. While extreme-scale language models such as GPT-3 offer multilingual capabilities, zero-shot learning for languages other than English remains largely unexplored. Here, we introduce Cedille, a large open-source autoregressive language model, specifically trained for the French language. Our results show that Cedille outperforms existing French language models and is competitive with GPT-3 on a range of French zero-shot benchmarks. Furthermore, we provide an in-depth comparison of the toxicity exhibited by these models, showing that Cedille marks an improvement in language model safety thanks to dataset filtering.