带有改组的SGD：没有组件凸的最佳速率和较大的时期要求

论文标题

带有改组的SGD：没有组件凸的最佳速率和较大的时期要求

SGD with shuffling: optimal rates without component convexity and large epoch requirements

论文作者

Ahn, Kwangjun, Yun, Chulhee, Sra, Suvrit

论文摘要

我们研究没有替代SGD来解决有限和优化问题。具体而言，根据有限和索引的索引如何洗牌，我们考虑随机扣除（每个时代开始时的洗牌）和单打拆除（仅一次洗牌一次）算法。首先，我们建立了这些算法的最小值最佳收敛速率，直到多志因子。值得注意的是，我们的分析足以支付梯度主导的非凸成本，并且不依赖于单个组件功能的凸度，这与现有的最佳收敛结果不同。其次，假设单个组件的凸度，我们通过消除所有先前艺术的缺点，进一步增强了随机归因的紧密收敛结果：结果所需的大量时期所需的大量时期，以及额外的多log因子差距到下限。

We study without-replacement SGD for solving finite-sum optimization problems. Specifically, depending on how the indices of the finite-sum are shuffled, we consider the RandomShuffle (shuffle at the beginning of each epoch) and SingleShuffle (shuffle only once) algorithms. First, we establish minimax optimal convergence rates of these algorithms up to poly-log factors. Notably, our analysis is general enough to cover gradient dominated nonconvex costs, and does not rely on the convexity of individual component functions unlike existing optimal convergence results. Secondly, assuming convexity of the individual components, we further sharpen the tight convergence results for RandomShuffle by removing the drawbacks common to all prior arts: large number of epochs required for the results to hold, and extra poly-log factor gaps to the lower bound.

下载PDF全文

下载文献需遵守相关版权规定

论文标题