Paper Title


Meta Mirror Descent: Optimiser Learning for Fast Convergence

Paper Authors

Boyan Gao, Henry Gouk, Hae Beom Lee, Timothy M. Hospedales

Paper Abstract


Optimisers are an essential component for training machine learning models, and their design influences learning speed and generalisation. Several studies have attempted to learn more effective gradient-descent optimisers via solving a bi-level optimisation problem where generalisation error is minimised with respect to optimiser parameters. However, most existing optimiser learning methods are intuitively motivated, without clear theoretical support. We take a different perspective starting from mirror descent rather than gradient descent, and meta-learning the corresponding Bregman divergence. Within this paradigm, we formalise a novel meta-learning objective of minimising the regret bound of learning. The resulting framework, termed Meta Mirror Descent (MetaMD), learns to accelerate optimisation speed. Unlike many meta-learned optimisers, it also supports convergence and generalisation guarantees and uniquely does so without requiring validation data. We evaluate our framework on a variety of tasks and architectures in terms of convergence rate and generalisation error and demonstrate strong performance.
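For readers unfamiliar with mirror descent, the sketch below illustrates the update rule the abstract refers to, w_{t+1} = argmin_w { lr * <grad f(w_t), w> + D_Phi(w, w_t) }, where D_Phi is the Bregman divergence of a strictly convex mirror map Phi. It is a minimal, hypothetical illustration, not the authors' MetaMD implementation: it fixes Phi to a hand-picked quadratic map Phi(w) = 0.5 * w^T M w (whose Bregman divergence is a Mahalanobis distance), whereas MetaMD meta-learns the divergence by minimising a regret bound. The names `mirror_descent_step` and `M` are assumptions made for this example.

```python
import numpy as np

def bregman_divergence_quadratic(w, w_prev, M):
    """Bregman divergence of the quadratic mirror map Phi(w) = 0.5 * w^T M w.
    For this choice it reduces to 0.5 * (w - w')^T M (w - w')."""
    d = w - w_prev
    return 0.5 * d @ M @ d

def mirror_descent_step(w, grad, M, lr=0.1):
    """One mirror descent update:
        w_{t+1} = argmin_w  lr * <grad, w> + D_Phi(w, w_t).
    With the quadratic mirror map this has the closed form
        w_{t+1} = w_t - lr * M^{-1} grad,
    i.e. a preconditioned gradient step.  A richer, meta-learned mirror map
    (as in MetaMD) changes the geometry of this update."""
    return w - lr * np.linalg.solve(M, grad)

# Toy usage: minimise f(w) = 0.5 * ||A w - b||^2 with a hand-picked M.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
w = np.zeros(5)
M = 2.0 * np.eye(5)  # stand-in for a learned strictly convex mirror map
for _ in range(200):
    grad = A.T @ (A @ w - b)
    w = mirror_descent_step(w, grad, M, lr=0.01)
print("final loss:", 0.5 * np.sum((A @ w - b) ** 2))
```

Note that with M set to the identity the step reduces to ordinary gradient descent, which is why learning the Bregman divergence can be read as learning a problem-adapted geometry for the optimiser.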
