Paper title
Anderson acceleration of coordinate descent
Paper authors
Paper abstract
Acceleration of first-order methods is mainly obtained via inertial techniques à la Nesterov, or via nonlinear extrapolation. The latter has seen a recent surge of interest, with successful applications to gradient and proximal gradient techniques. On multiple machine learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy, however: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed-ups in practice compared to inertially accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net, and logistic regression validate the approach.
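
To make the idea concrete, below is a minimal NumPy sketch of offline Anderson extrapolation wrapped around cyclic coordinate descent on the least squares problem mentioned in the abstract. This is an illustrative assumption of the general technique, not the authors' implementation: the function names (`anderson_cd`, `cd_epoch`, `anderson_extrapolate`), the memory size `K`, the Gram-matrix regularization, and the objective-decrease guard are all choices made here for the sketch.

```python
import numpy as np

def cd_epoch(x, A, r, lipschitz):
    # One pass of cyclic coordinate descent on 0.5 * ||A x - b||^2,
    # keeping the residual r = A x - b up to date after each update.
    for j in range(A.shape[1]):
        old = x[j]
        x[j] -= A[:, j] @ r / lipschitz[j]
        r += (x[j] - old) * A[:, j]
    return x, r

def anderson_extrapolate(X):
    # X has shape (K + 1, n): the K + 1 most recent iterates.
    # Find weights c with sum(c) = 1 minimizing ||U c|| where U stacks
    # successive differences, then return the weighted combination.
    U = np.diff(X, axis=0)               # shape (K, n)
    G = U @ U.T                          # K x K Gram matrix
    G += 1e-10 * np.eye(G.shape[0])      # small ridge for numerical stability (a choice made here)
    z = np.linalg.solve(G, np.ones(G.shape[0]))
    c = z / z.sum()
    return c @ X[1:]

def anderson_cd(A, b, K=5, n_epochs=100):
    # Sketch of Anderson-accelerated coordinate descent; assumes A has
    # no all-zero columns so the coordinate-wise Lipschitz constants are positive.
    n_features = A.shape[1]
    lipschitz = (A ** 2).sum(axis=0)
    x = np.zeros(n_features)
    r = A @ x - b
    iterates = []
    for _ in range(n_epochs):
        x, r = cd_epoch(x, A, r, lipschitz)
        iterates.append(x.copy())
        if len(iterates) == K + 1:
            x_e = anderson_extrapolate(np.array(iterates))
            # Guarded step: accept the extrapolated point only if it
            # decreases the objective 0.5 * ||A x - b||^2.
            if 0.5 * np.sum((A @ x_e - b) ** 2) < 0.5 * r @ r:
                x = x_e
                r = A @ x - b
            iterates = []
    return x
```

Extrapolating once per pass over all coordinates, rather than after every single coordinate update, keeps the cost of solving the small K × K system negligible next to the epochs themselves, which is presumably why extrapolation on epoch iterates is the natural fit for coordinate descent.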