Paper Title
AdapterHub: A Framework for Adapting Transformers
Paper Authors
Paper Abstract
The current modus operandi in NLP involves downloading and fine-tuning pre-trained models consisting of millions or billions of parameters. Storing and sharing such large trained models is expensive, slow, and time-consuming, which impedes progress towards more general and versatile NLP methods that learn from and for many tasks. Adapters -- small learnt bottleneck layers inserted within each layer of a pre-trained model -- ameliorate this issue by avoiding full fine-tuning of the entire model. However, sharing and integrating adapter layers is not straightforward. We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages. The framework, built on top of the popular HuggingFace Transformers library, enables extremely easy and quick adaptations of state-of-the-art pre-trained models (e.g., BERT, RoBERTa, XLM-R) across tasks and languages. Downloading, sharing, and training adapters is as seamless as possible using minimal changes to the training scripts and a specialized infrastructure. Our framework enables scalable and easy access to sharing of task-specific models, particularly in low-resource scenarios. AdapterHub includes all recent adapter architectures and can be found at https://AdapterHub.ml.
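To make the "small learnt bottleneck layers" mentioned in the abstract concrete, the sketch below shows a generic bottleneck adapter: a down-projection, a non-linearity, an up-projection, and a residual connection, inserted after a Transformer sub-layer while the pre-trained weights stay frozen. This is an illustrative sketch only; the class and parameter names (BottleneckAdapter, reduction_factor) are hypothetical and it is not the AdapterHub implementation.

```python
# Minimal, generic sketch of a bottleneck adapter layer (illustrative only,
# not the AdapterHub implementation).
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Small bottleneck layer inserted after a Transformer sub-layer.

    Only these few parameters are trained; the pre-trained model stays frozen.
    """

    def __init__(self, hidden_size: int = 768, reduction_factor: int = 16):
        super().__init__()
        bottleneck_size = hidden_size // reduction_factor
        self.down_proj = nn.Linear(hidden_size, bottleneck_size)
        self.activation = nn.ReLU()
        self.up_proj = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the adapter close to an identity map,
        # so the frozen pre-trained representations pass through largely intact.
        return hidden_states + self.up_proj(self.activation(self.down_proj(hidden_states)))


# Usage sketch: apply the adapter to the activations of one (frozen) layer.
adapter = BottleneckAdapter(hidden_size=768, reduction_factor=16)
layer_output = torch.randn(2, 10, 768)  # (batch, sequence, hidden) activations
adapted = adapter(layer_output)
```

In AdapterHub itself, such layers are added, trained, and shared through methods of the adapter-transformers models built on HuggingFace Transformers (e.g., add_adapter and train_adapter), which is what keeps the required changes to existing training scripts minimal.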