Paper Title
Task-specific Compression for Multi-task Language Models using Attribution-based Pruning
Paper Authors
Paper Abstract
Multi-task language models show outstanding performance on various natural language understanding tasks with only a single model. However, these language models utilize an unnecessarily large number of model parameters, even when used only for a specific task. This paper proposes a novel training-free compression method for multi-task language models using a pruning method. Specifically, we use an attribution method to determine which neurons are essential for performing a specific task. We task-specifically prune unimportant neurons and leave only task-specific parameters. Furthermore, we extend our method to be applicable in low-resource and unsupervised settings. Since our compression method is training-free, it requires few computing resources and does not destroy the pre-trained knowledge of language models. Experimental results on six widely used datasets show that our proposed pruning method significantly outperforms baseline pruning methods. In addition, we demonstrate that our method preserves performance even in an unseen domain setting.
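To make the idea of attribution-based, task-specific pruning concrete, below is a minimal sketch, not the paper's actual algorithm or models: it uses a toy feed-forward network, a gradient-times-activation attribution heuristic to score hidden neurons on task-specific examples, and a hypothetical 50% keep ratio. All names and values are illustrative placeholders.

```python
# Minimal sketch of attribution-based, task-specific neuron pruning (training-free).
# Assumptions: toy backbone, gradient * activation as the attribution score,
# and a fixed keep ratio; none of these are taken from the paper itself.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy backbone: one hidden layer whose neurons we will score and prune.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
x = torch.randn(128, 32)                  # task-specific inputs (placeholder)
y = torch.randint(0, 2, (128,))           # task-specific labels (placeholder)

# 1) Capture hidden activations and their gradients with a forward hook.
acts = {}
def save_act(_, __, output):
    output.retain_grad()
    acts["hidden"] = output
hook = model[1].register_forward_hook(save_act)

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
hook.remove()

# 2) Attribution score per neuron: mean |activation * gradient| over the batch.
hidden = acts["hidden"]
scores = (hidden * hidden.grad).abs().mean(dim=0).detach()   # shape: (64,)

# 3) Keep only the most important neurons for this task; mask out the rest.
keep_ratio = 0.5                          # hypothetical compression level
k = int(keep_ratio * scores.numel())
mask = torch.zeros_like(scores)
mask[scores.topk(k).indices] = 1.0

# 4) Training-free pruning: zero the weights and biases of pruned neurons.
with torch.no_grad():
    model[0].weight.mul_(mask.unsqueeze(1))
    model[0].bias.mul_(mask)
```

Because the pruning step only masks existing weights and never updates them, the remaining parameters retain the pre-trained knowledge, which is the property the abstract highlights as training-free compression.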