Paper Title
Training Stronger Baselines for Learning to Optimize
Paper Authors
Paper Abstract
Learning to optimize (L2O) has gained increasing attention since classical optimizers require laborious problem-specific design and hyperparameter tuning. However, there is a gap between the practical demand and the achievable performance of existing L2O models. Specifically, those learned optimizers are applicable to only a limited class of problems and often exhibit instability. While many efforts have been devoted to designing more sophisticated L2O models, we argue for another orthogonal, under-explored theme: the training techniques for those L2O models. We show that even the simplest L2O model could have been trained much better. We first present a progressive training scheme that gradually increases the optimizer unroll length, to mitigate the well-known L2O dilemma of truncation bias (shorter unrolling) versus gradient explosion (longer unrolling). We further leverage off-policy imitation learning to guide L2O training, using the behavior of analytical optimizers as a reference. Our improved training techniques are plugged into a variety of state-of-the-art L2O models and immediately boost their performance, without any change to their model structures. In particular, with our proposed techniques, the earliest and simplest L2O model can be trained to outperform the latest, more complicated L2O models on a number of tasks. Our results demonstrate a greater potential of L2O yet to be unleashed, and urge a rethinking of the recent progress. Our code is publicly available at: https://github.com/VITA-Group/L2O-Training-Techniques.
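A minimal sketch of the two training ideas the abstract describes: a curriculum that progressively lengthens the optimizer unroll, and an imitation penalty that keeps the learned updates close to an analytical optimizer's behavior (a plain gradient-descent step here, as a simplified stand-in for the paper's off-policy scheme). All names and interfaces below (`l2o`, `sample_optimizee`, the loss weighting, the stage schedule) are illustrative assumptions, not the authors' actual API; see the released code for the real implementation.

```python
import torch

def unrolled_meta_loss(l2o, optimizee, unroll_len, imitation_weight=0.1):
    """Run `unroll_len` inner steps with the learned optimizer; return the
    accumulated meta-loss plus an imitation penalty toward an analytical
    reference update (here, a fixed-step gradient-descent step)."""
    meta_loss = 0.0
    params = [p.clone().requires_grad_(True) for p in optimizee.init_params()]
    for _ in range(unroll_len):
        loss = optimizee(params)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        learned_updates = l2o(grads, params)            # updates proposed by the L2O model
        reference_updates = [-0.01 * g for g in grads]  # analytical baseline (SGD step)
        imitation = sum(((u - r) ** 2).mean()
                        for u, r in zip(learned_updates, reference_updates))
        params = [p + u for p, u in zip(params, learned_updates)]
        meta_loss = meta_loss + optimizee(params) + imitation_weight * imitation
    return meta_loss

def train_l2o(l2o, sample_optimizee, stages=(5, 10, 20, 50), epochs_per_stage=100):
    """Progressive curriculum: begin with short unrolls (small truncation bias,
    stable gradients) and lengthen the unroll as training proceeds."""
    meta_opt = torch.optim.Adam(l2o.parameters(), lr=1e-3)
    for unroll_len in stages:
        for _ in range(epochs_per_stage):
            optimizee = sample_optimizee()              # fresh optimization problem
            meta_opt.zero_grad()
            unrolled_meta_loss(l2o, optimizee, unroll_len).backward()
            meta_opt.step()
```

The staged `stages=(5, 10, 20, 50)` schedule is an assumed example of "gradually increasing the unroll length"; the actual schedule and the off-policy imitation mechanism differ in the paper and its repository.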