Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

Paper Title

Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

Authors

Yehuda Dar, Richard G. Baraniuk

Abstract

We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to target task, and the relation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial as a substitute for extra overparameterization (i.e., additional free parameters in the target task). Specifically, we show that the usefulness of a transfer learning setting is fragile and depends on a delicate interplay among the set of transferred parameters, the relation between the tasks, and the true solution. We also demonstrate that overparameterized transfer learning is not necessarily more beneficial when the source task is closer or identical to the target task.
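
The parameter transfer mechanism described in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: the dimensions, the noise levels, the additive relation between the source and target parameter vectors, and the names `min_norm_fit` and `transfer_fit` are all assumptions chosen for illustration, and the source task is simplified to use every coordinate as a free parameter. The sketch fixes a subset of target coordinates to the source estimate, fits the free coordinates by minimum-norm least squares, and leaves the remaining coordinates at zero.

```python
import numpy as np


def min_norm_fit(X, y):
    """Minimum-norm least-squares solution; interpolates when X has full row rank."""
    return np.linalg.pinv(X) @ y


def transfer_fit(X_tgt, y_tgt, theta_src, transferred, free):
    """Fit the target task with some coordinates constrained to source values.

    transferred: indices copied from the source estimate theta_src.
    free: indices estimated from target data.
    All other coordinates are kept at zero.
    """
    beta = np.zeros(X_tgt.shape[1])
    beta[transferred] = theta_src[transferred]
    # Subtract the contribution of the transferred coordinates, then fit the free ones.
    residual = y_tgt - X_tgt[:, transferred] @ beta[transferred]
    beta[free] = min_norm_fit(X_tgt[:, free], residual)
    return beta


rng = np.random.default_rng(0)
d, n_src, n_tgt = 40, 30, 15                         # illustrative sizes (assumed)
beta_true = rng.normal(size=d) / np.sqrt(d)          # true target parameters
theta_true = beta_true + 0.1 * rng.normal(size=d)    # related source task (assumed relation)

# Source task: overparameterized (d > n_src), solved by minimum-norm interpolation.
X_src = rng.normal(size=(n_src, d))
y_src = X_src @ theta_true + 0.1 * rng.normal(size=n_src)
theta_src = min_norm_fit(X_src, y_src)

# Target task: 10 transferred coordinates, 25 free coordinates, 5 fixed at zero.
X_tgt = rng.normal(size=(n_tgt, d))
y_tgt = X_tgt @ beta_true + 0.1 * rng.normal(size=n_tgt)
transferred = np.arange(0, 10)
free = np.arange(10, 35)
beta_hat = transfer_fit(X_tgt, y_tgt, theta_src, transferred, free)

# For isotropic Gaussian features, the excess out-of-sample squared error
# equals the squared distance between the estimate and the true parameters.
excess_error = float(np.sum((beta_hat - beta_true) ** 2))
print(f"excess error with transfer: {excess_error:.4f}")
```

Sweeping the sizes of `free` and `transferred` (and the number of free parameters in the source fit) over a grid and averaging `excess_error` over random draws is one way to explore empirically the kind of two-dimensional error surface the abstract refers to; the sketch itself makes no claim about the resulting shape.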
