论文标题
在Minibatch重球动量的快速收敛性上
On the fast convergence of minibatch heavy ball momentum
论文作者
论文摘要
简单的随机动量方法被广泛用于机器学习优化,但它们的良好实践表现与文献中没有理论保证的理论保证相矛盾。在这项工作中,我们的目标是通过表明随机重球动量保留了二次优化问题的(确定性)重球动量的快速线性速率,至少在用足够大的批量大小时,在二次优化问题上保持了(确定性)重球动量的快速线性速率。我们研究的算法可以解释为具有小匹配和重球动量的加速随机Kaczmarz算法。该分析依赖于仔细分解动量过渡矩阵,并使用新的光谱范数浓度界限来进行独立随机矩阵的产物。我们提供数值插图,证明我们的边界相当尖锐。
Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size. The algorithm we study can be interpreted as an accelerated randomized Kaczmarz algorithm with minibatching and heavy ball momentum. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical illustrations demonstrating that our bounds are reasonably sharp.