Reducing Retraining by Recycling Parameter-Efficient Prompts

Paper title

Reducing Retraining by Recycling Parameter-Efficient Prompts

Paper authors

Brian Lester, Joshua Yurtsever, Siamak Shakeri, Noah Constant

Paper abstract

Parameter-efficient methods are able to use a single frozen pre-trained large language model (LLM) to perform many tasks by learning task-specific soft prompts that modulate model behavior when concatenated to the input text. However, these learned prompts are tightly coupled to a given frozen model: if the model is updated, corresponding new prompts need to be obtained. In this work, we propose and investigate several approaches to "Prompt Recycling", where a prompt trained on a source model is transformed to work with the new target model. Our methods do not rely on supervised pairs of prompts, task-specific data, or training updates with the target model, which would be just as costly as re-tuning prompts with the target model from scratch. We show that recycling between models is possible (our best settings are able to successfully recycle 88.9% of prompts, producing a prompt that outperforms baselines), but significant performance headroom remains, requiring improved recycling techniques.
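As a rough illustration of what "concatenated to the input text" means for a soft prompt, the sketch below prepends a small matrix of learned prompt vectors to the embedded input tokens before they would be passed to a frozen model. It is a minimal sketch, not the authors' implementation: the function name `prepend_soft_prompt`, the embedding size, and the prompt length are all hypothetical, and random values stand in for both the learned prompt and the frozen model's token embeddings.

```python
import numpy as np


def prepend_soft_prompt(prompt, input_embeddings):
    """Concatenate soft-prompt vectors in front of the embedded input tokens.

    prompt:           (prompt_length, embed_dim) array of learned vectors
    input_embeddings: (num_tokens, embed_dim) array from the frozen model
    """
    if prompt.shape[1] != input_embeddings.shape[1]:
        raise ValueError("prompt and input must share the embedding dimension")
    # Stack along the sequence axis: prompt vectors first, then the tokens.
    return np.concatenate([prompt, input_embeddings], axis=0)


rng = np.random.default_rng(0)
embed_dim = 8                               # hypothetical embedding size
prompt = rng.normal(size=(5, embed_dim))    # stand-in for 5 learned prompt vectors
tokens = rng.normal(size=(12, embed_dim))   # stand-in for 12 embedded input tokens

model_input = prepend_soft_prompt(prompt, tokens)
print(model_input.shape)  # (17, 8)
```

Because the prompt vectors live in the embedding space of the model they were trained against, nothing in this step adapts them to a different model, which is the coupling the abstract describes.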

Paper title