Paper Title

On the Usability of Transformers-based models for a French Question-Answering task

Authors

Cattan, Oralie, Servan, Christophe, Rosset, Sophie

Abstract

For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigm shift in practice from task-specific architectures to the fine-tuning of pre-trained language models. The ongoing trend consists of training models with ever-increasing amounts of data and parameters, which requires considerable resources. This has led to a strong push to improve resource efficiency through algorithmic and hardware improvements that are evaluated only for English, raising questions about their usability when applied to small-scale learning problems, for which only a limited amount of training data is available, especially for tasks in under-resourced languages. The lack of appropriately sized corpora is a hindrance to applying data-driven and transfer-learning-based approaches, and leads to strong instability. In this paper, we survey the state of the art of efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on question-answering performance for French, a language with few resources. We address the instability arising from data scarcity by investigating various training strategies involving data augmentation, hyperparameter optimization and cross-lingual transfer. We also introduce a new compact model for French, FrALBERT, which proves competitive in low-resource settings.
