Paper Title

How Does the Task Landscape Affect MAML Performance?

Authors

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Abstract

Model-Agnostic Meta-Learning (MAML) has become increasingly popular for training models that can quickly adapt to new tasks via one or a few stochastic gradient descent steps. However, the MAML objective is significantly more difficult to optimize compared to standard non-adaptive learning (NAL), and little is understood about how much MAML improves over NAL in terms of the fast adaptability of their solutions in various scenarios. We analytically address this issue in a linear regression setting consisting of a mixture of easy and hard tasks, where hardness is related to the rate at which gradient descent converges on the task. Specifically, we prove that in order for MAML to achieve substantial gain over NAL, (i) there must be some discrepancy in hardness among the tasks, and (ii) the optimal solutions of the hard tasks must be closely packed, with their center far from the center of the easy tasks' optimal solutions. We also give numerical and analytical results suggesting that these insights apply to two-layer neural networks. Finally, we provide few-shot image classification experiments that support our insights on when MAML should be used and emphasize the importance of training MAML on hard tasks in practice.
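The abstract's two conditions can be illustrated with a toy sketch (not the paper's exact linear-regression model): treat each task as a scalar quadratic whose curvature sets how fast gradient descent converges on it, place tightly packed hard-task optima far from the spread-out easy-task optima, and compare the post-adaptation loss of the NAL and MAML solutions. All task parameters below are invented for illustration.

```python
import numpy as np

# Sketch of the easy/hard-task intuition. Task i is the scalar quadratic
#   f_i(w) = 0.5 * h_i * (w - w_i)**2,
# where h_i is the curvature: one gradient step with step size alpha shrinks
# the distance to the task optimum by a factor (1 - alpha * h_i), so SMALL
# h_i means slow convergence, i.e. a "hard" task.

alpha = 0.45  # inner-loop (adaptation) step size

# Hard tasks: small curvature, optima tightly packed around w = 5.
# Easy tasks: large curvature, optima spread around w = 0.
h     = np.array([0.5, 0.5, 0.5, 2.0, 2.0, 2.0])   # task hardness (curvature)
w_opt = np.array([4.8, 5.0, 5.2, -2.0, 0.0, 2.0])  # per-task optimal solutions

def post_adaptation_loss(w):
    """Average loss after one gradient step on each task, starting from w."""
    shrink = 1.0 - alpha * h  # per-task residual contraction after one step
    return np.mean(0.5 * h * shrink**2 * (w - w_opt)**2)

# NAL minimizes the average *pre*-adaptation loss -> curvature-weighted mean,
# which is dominated by the (high-curvature) easy tasks.
w_nal = np.sum(h * w_opt) / np.sum(h)

# MAML minimizes the average *post*-adaptation loss; for these quadratics
# task i carries effective weight h_i * (1 - alpha * h_i)**2, which upweights
# the hard (slow-to-adapt) tasks, pulling the solution toward their cluster.
c = h * (1.0 - alpha * h)**2
w_maml = np.sum(c * w_opt) / np.sum(c)

print(f"NAL  init {w_nal:.3f}, post-adaptation loss {post_adaptation_loss(w_nal):.3f}")
print(f"MAML init {w_maml:.3f}, post-adaptation loss {post_adaptation_loss(w_maml):.3f}")
```

With this configuration the NAL solution sits near the easy-task center while the MAML solution sits near the hard-task cluster, giving MAML a much lower post-adaptation loss; if the hard-task optima are spread out or all tasks share the same hardness, the gap largely disappears, matching conditions (i) and (ii) above.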
