Paper Title

Stochastic Subspace Cubic Newton Method

Authors

Filip Hanzely, Nikita Doikov, Peter Richtárik, Yurii Nesterov

Abstract

In this paper, we propose a new randomized second-order optimization algorithm---Stochastic Subspace Cubic Newton (SSCN)---for minimizing a high-dimensional convex function $f$. Our method can be seen both as a {\em stochastic} extension of the cubically-regularized Newton method of Nesterov and Polyak (2006), and a {\em second-order} enhancement of stochastic subspace descent of Kozak et al. (2019). We prove that as we vary the minibatch size, the global convergence rate of SSCN interpolates between the rate of stochastic coordinate descent (CD) and the rate of cubic regularized Newton, thus giving new insights into the connection between first- and second-order methods. Remarkably, the local convergence rate of SSCN matches the rate of stochastic subspace descent applied to the problem of minimizing the quadratic function $\frac12 (x-x^*)^\top \nabla^2f(x^*)(x-x^*)$, where $x^*$ is the minimizer of $f$, and hence depends only on the properties of $f$ at the optimum. Our numerical experiments show that SSCN outperforms non-accelerated first-order CD algorithms while being competitive with their accelerated variants.
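To make the abstract's description concrete, below is a minimal Python sketch of one SSCN-style iteration for the special case where the random subspace is spanned by $\tau$ coordinate directions. The oracles `grad_f`/`hess_f`, the regularization parameter `M`, and the bisection-based cubic-subproblem solver are illustrative assumptions for this sketch, not the authors' implementation; the paper treats general sketch matrices and the choice of parameters.

```python
import numpy as np

def sscn_step(x, grad_f, hess_f, tau, M, rng):
    """One SSCN-style iteration restricted to a random coordinate subspace (sketch).

    grad_f, hess_f: oracles returning the full gradient / Hessian of f at x
    tau:            subspace (minibatch) size
    M:              cubic regularization parameter (assumed known here)
    """
    d = x.shape[0]
    idx = rng.choice(d, size=tau, replace=False)   # sample tau coordinates
    g = grad_f(x)[idx]                             # subspace gradient  S^T grad f(x)
    H = hess_f(x)[np.ix_(idx, idx)]                # subspace Hessian   S^T hess f(x) S

    if np.linalg.norm(g) == 0.0:                   # already stationary in this subspace
        return x.copy()

    # Cubic subproblem:  min_h  <g, h> + 0.5 h^T H h + (M/6) ||h||^3.
    # Its minimizer satisfies h = -(H + (M r / 2) I)^{-1} g with r = ||h||,
    # so we locate r by bisection (f is convex, hence H is positive semidefinite).
    step = lambda r: -np.linalg.solve(H + 0.5 * M * r * np.eye(tau), g)
    lo, hi = 0.0, 1.0
    while np.linalg.norm(step(hi)) > hi:           # bracket the fixed point r = ||h(r)||
        hi *= 2.0
    for _ in range(60):                            # bisection on r
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.linalg.norm(step(mid)) > mid else (lo, mid)
    h = step(hi)

    x_new = x.copy()
    x_new[idx] += h                                # move only along the sampled coordinates
    return x_new


# Toy usage: minimize f(x) = 0.5 x^T A x - b^T x with A positive definite.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)); A = A @ A.T + np.eye(50)
b = rng.standard_normal(50)
grad_f = lambda x: A @ x - b
hess_f = lambda x: A
x = np.zeros(50)
for _ in range(500):
    x = sscn_step(x, grad_f, hess_f, tau=5, M=1.0, rng=rng)
```

With `tau = d` this sketch reduces to a full cubically-regularized Newton step, while `tau = 1` recovers a coordinate-descent-like update with second-order information along the chosen coordinate, mirroring the interpolation between CD and cubic Newton described in the abstract.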
