Paper Title

On the Convergence Analysis of Aggregated Heavy-Ball Method

Paper Author

Danilova, Marina

Paper Abstract

Momentum first-order optimization methods are the workhorses in various optimization tasks, e.g., in the training of deep neural networks. Recently, Lucas et al. (2019) proposed a method called Aggregated Heavy-Ball (AggHB) that maintains multiple momentum vectors corresponding to different momentum parameters and averages these vectors to compute the update direction at each iteration. Lucas et al. (2019) showed that AggHB is more stable than the classical Heavy-Ball method even with large momentum parameters and performs well in practice. However, the method was analyzed only for quadratic objectives and for online optimization tasks under the uniformly bounded gradients assumption, which is not satisfied for many practically important problems. In this work, we address this issue and propose the first analysis of AggHB for smooth objective functions in the non-convex, convex, and strongly convex cases without additional restrictive assumptions. Our complexity results match the best-known ones for the Heavy-Ball method. We also illustrate the efficiency of AggHB numerically on several non-convex and convex problems.
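The mechanism described in the abstract — one momentum vector per momentum parameter, with their average used as the update direction — can be sketched as follows. This is a minimal illustrative sketch, not the exact algorithm or step-size schedule from the paper; the learning rate, the momentum grid `betas`, and the iteration count are all assumptions chosen for the toy example.

```python
import numpy as np

def agghb(grad, x0, lr=0.05, betas=(0.0, 0.5, 0.9), n_iters=500):
    """Illustrative sketch of the Aggregated Heavy-Ball idea:
    several momentum vectors, one per momentum parameter beta_k,
    are updated in parallel and averaged to form the step direction.
    Hyperparameters here are assumptions, not the paper's settings."""
    x = np.asarray(x0, dtype=float)
    # One momentum vector per momentum parameter beta_k.
    vs = [np.zeros_like(x) for _ in betas]
    for _ in range(n_iters):
        g = grad(x)
        # Each momentum vector is updated with its own beta_k.
        vs = [beta * v + g for beta, v in zip(betas, vs)]
        # The update direction is the average of all momentum vectors.
        x = x - lr * np.mean(vs, axis=0)
    return x

# Usage on a simple smooth, strongly convex quadratic
# f(x) = 0.5 * ||x||^2 with gradient x, minimized at the origin.
x_star = agghb(lambda x: x, x0=np.array([5.0, -3.0]))
```

Setting one of the `betas` to zero makes that component a plain gradient step, so the averaged direction blends low- and high-momentum updates, which is the stabilizing effect Lucas et al. (2019) observed.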
