Paper Title
AdapterHub: A Framework for Adapting Transformers
Paper Authors
Paper Abstract
The current modus operandi in NLP involves downloading and fine-tuning pre-trained models consisting of millions or billions of parameters. Storing and sharing such large trained models is expensive, slow, and time-consuming, which impedes progress towards more general and versatile NLP methods that learn from and for many tasks. Adapters -- small learnt bottleneck layers inserted within each layer of a pre-trained model -- ameliorate this issue by avoiding full fine-tuning of the entire model. However, sharing and integrating adapter layers is not straightforward. We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages. The framework, built on top of the popular HuggingFace Transformers library, enables extremely easy and quick adaptations of state-of-the-art pre-trained models (e.g., BERT, RoBERTa, XLM-R) across tasks and languages. Downloading, sharing, and training adapters is as seamless as possible using minimal changes to the training scripts and a specialized infrastructure. Our framework enables scalable and easy access to sharing of task-specific models, particularly in low-resource scenarios. AdapterHub includes all recent adapter architectures and can be found at https://AdapterHub.ml.
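To make the "small learnt bottleneck layers" mentioned in the abstract concrete, the sketch below shows a generic bottleneck adapter: a down-projection, a non-linearity, an up-projection, and a residual connection, inserted after a Transformer sub-layer while the pre-trained weights stay frozen. This is an illustrative sketch only; the class and parameter names (BottleneckAdapter, reduction_factor) are hypothetical and it is not the AdapterHub implementation.

```python
# Minimal, generic sketch of a bottleneck adapter layer (illustrative only,
# not the AdapterHub implementation).
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Small bottleneck layer inserted after a Transformer sub-layer.

    Only these few parameters are trained; the pre-trained model stays frozen.
    """

    def __init__(self, hidden_size: int = 768, reduction_factor: int = 16):
        super().__init__()
        bottleneck_size = hidden_size // reduction_factor
        self.down_proj = nn.Linear(hidden_size, bottleneck_size)
        self.activation = nn.ReLU()
        self.up_proj = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the adapter close to an identity map,
        # so the frozen pre-trained representations pass through largely intact.
        return hidden_states + self.up_proj(self.activation(self.down_proj(hidden_states)))


# Usage sketch: apply the adapter to the activations of one (frozen) layer.
adapter = BottleneckAdapter(hidden_size=768, reduction_factor=16)
layer_output = torch.randn(2, 10, 768)  # (batch, sequence, hidden) activations
adapted = adapter(layer_output)
```

In AdapterHub itself, such layers are added, trained, and shared through methods of the adapter-transformers models built on HuggingFace Transformers (e.g., add_adapter and train_adapter), which is what keeps the required changes to existing training scripts minimal.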