Paper Title
ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm
Paper Authors
Paper Abstract
We study the problem of solving strongly convex and smooth unconstrained optimization problems using stochastic first-order algorithms. We devise a novel algorithm, referred to as Recursive One-Over-T SGD (ROOT-SGD), based on an easily implementable recursive averaging of past stochastic gradients. We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an asymptotic sense. On the nonasymptotic side, we prove risk bounds on the last iterate of ROOT-SGD whose leading-order term matches the optimal statistical risk with a unity pre-factor, along with a higher-order term that scales at the sharp rate of $O(n^{-3/2})$ under a Lipschitz condition on the Hessian matrix. On the asymptotic side, we show that when a mild, one-point Hessian continuity condition is imposed, the rescaled last iterate of (multi-epoch) ROOT-SGD converges asymptotically to a Gaussian limit with the Cramér-Rao optimal asymptotic covariance, for a broad range of step-size choices.
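Since the abstract describes ROOT-SGD only as a "recursive averaging of past stochastic gradients," the following is a minimal sketch of one plausible reading of that recursion, suggested by the name Recursive One-Over-T: each fresh sample is evaluated at the two most recent iterates, and the estimates are blended with a $(1 - 1/t)$ weight via $v_t = \nabla f(x_{t-1}; z_t) + (1 - 1/t)\,(v_{t-1} - \nabla f(x_{t-2}; z_t))$, followed by $x_t = x_{t-1} - \eta v_t$. The exact update, the streaming least-squares test problem, the step size `eta`, and all function names are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x_star = rng.normal(size=d)  # ground-truth minimizer (assumed test setup)

def sample():
    """Fresh data point for a streaming least-squares objective
    f(x) = 0.5 * E[(a @ x - b)^2], whose minimizer is x_star."""
    a = rng.normal(size=d)
    b = a @ x_star + 0.1 * rng.normal()
    return a, b

def grad(x, a, b):
    """Unbiased stochastic gradient of f at x for the sample (a, b)."""
    return a * (a @ x - b)

def root_sgd(x0, eta=0.05, n=20000):
    """Single-epoch ROOT-SGD sketch (update rule assumed, see lead-in):
        v_t = grad(x_{t-1}; z_t) + (1 - 1/t) * (v_{t-1} - grad(x_{t-2}; z_t))
        x_t = x_{t-1} - eta * v_t
    The same fresh sample z_t is queried at the two latest iterates,
    which recursively averages past stochastic gradients."""
    x_prev = np.array(x0, dtype=float)
    x_prev2, v = None, None
    for t in range(1, n + 1):
        a, b = sample()  # fresh sample z_t
        if t == 1:
            v = grad(x_prev, a, b)  # v_1 is a plain stochastic gradient
        else:
            v = grad(x_prev, a, b) + (1 - 1 / t) * (v - grad(x_prev2, a, b))
        x_prev2, x_prev = x_prev, x_prev - eta * v
    return x_prev

x_hat = root_sgd(np.zeros(d))
print(np.linalg.norm(x_hat - x_star))  # distance to the minimizer; should be small
```

On this toy problem the last iterate `x_hat` lands close to `x_star`, consistent with the last-iterate risk bounds the abstract claims; constants such as `eta` and `n` here are arbitrary choices for illustration.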