Paper Title

muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems

Paper Authors

Andrea Gesmundo, Jeff Dean

Paper Abstract

Most uses of machine learning today involve training a model from scratch for a particular task, or sometimes starting with a model pretrained on a related task and then fine-tuning on a downstream task. Both approaches offer limited knowledge transfer between different tasks, require time-consuming human-driven customization to individual tasks, and incur high computational costs, especially when starting from randomly initialized models. We propose a method that uses the layers of a pretrained deep neural network as building blocks to construct an ML system that can jointly solve an arbitrary number of tasks. The resulting system can leverage cross-task knowledge transfer while being immune to common drawbacks of multitask approaches such as catastrophic forgetting, gradient interference, and negative transfer. We define an evolutionary approach designed to jointly select the prior knowledge relevant for each task, choose the subset of the model parameters to train, and dynamically auto-tune its hyperparameters. Furthermore, a novel scale control method is employed to achieve quality/size trade-offs that outperform common fine-tuning techniques. Compared with standard fine-tuning on a benchmark of 10 diverse image classification tasks, the proposed model improves the average accuracy by 2.39% while using 47% fewer parameters per task.
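
The abstract describes the search only at a high level. For intuition, below is a minimal Python sketch of an evolutionary loop with that general shape: each candidate specifies which pretrained layers it reuses, which of them get task-specific trainable copies, and its own training hyperparameters, and a mutate-and-select loop tunes all three jointly. Every name here (`Candidate`, `mutate`, `evolve`, `NUM_PRETRAINED_LAYERS`, the scoring function) is an illustrative assumption, not the paper's actual algorithm or code.

```python
import copy
import random
from dataclasses import dataclass, field

NUM_PRETRAINED_LAYERS = 12  # assumed depth of the pretrained backbone (hypothetical)


@dataclass
class Candidate:
    """One model path built from the pretrained network's layers."""
    # Which pretrained layers this path reuses (selection of prior knowledge).
    layers: list = field(default_factory=lambda: list(range(NUM_PRETRAINED_LAYERS)))
    # Layers that get task-specific trainable copies; all others stay frozen and shared.
    trainable: set = field(default_factory=lambda: {NUM_PRETRAINED_LAYERS - 1})
    # Training hyperparameters that the search auto-tunes.
    hparams: dict = field(default_factory=lambda: {"lr": 1e-3, "dropout": 0.1})


def mutate(parent: Candidate) -> Candidate:
    """Apply one random mutation: toggle a layer's trainability or perturb a hyperparameter."""
    child = copy.deepcopy(parent)
    if random.random() < 0.5:
        layer = random.choice(child.layers)
        child.trainable ^= {layer}  # add or remove a task-specific trainable copy
    else:
        key = random.choice(list(child.hparams))
        child.hparams[key] *= random.choice([0.5, 2.0])
    return child


def evolve(score_fn, generations: int = 20) -> Candidate:
    """Greedy (1+1) evolution: keep a child only if it improves the task score.

    `score_fn(candidate)` stands in for training the candidate's trainable layers
    with its hyperparameters on the target task and returning a validation score,
    possibly penalized by the number of task-specific parameters.
    """
    best = Candidate()
    best_score = score_fn(best)
    for _ in range(generations):
        child = mutate(best)
        score = score_fn(child)
        if score > best_score:
            best, best_score = child, score
    return best


if __name__ == "__main__":
    # Toy score: prefer fewer task-specific (trainable) layers, all else equal.
    toy_score = lambda c: -len(c.trainable) + random.random()
    print(evolve(toy_score))
```

In this sketch the frozen layers can be shared across tasks unchanged, which is one way to read the abstract's claim of cross-task transfer without catastrophic forgetting or gradient interference; the per-task trainable copies and hyperparameters are the only state the loop modifies.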
