Paper Title

Stochastic Gradient Langevin with Delayed Gradients

Paper Authors

Vyacheslav Kungurtsev, Bapi Chatterjee, Dan Alistarh

Paper Abstract

Stochastic Gradient Langevin Dynamics (SGLD) ensures strong guarantees with regards to convergence in measure for sampling log-concave posterior distributions by adding noise to stochastic gradient iterates. Given the size of many practical problems, parallelizing across several asynchronously running processors is a popular strategy for reducing the end-to-end computation time of stochastic optimization algorithms. In this paper, we are the first to investigate the effect of asynchronous computation, in particular, the evaluation of stochastic Langevin gradients at delayed iterates, on the convergence in measure. For this, we exploit recent results modeling Langevin dynamics as solving a convex optimization problem on the space of measures. We show that the rate of convergence in measure is not significantly affected by the error caused by the delayed gradient information used for computation, suggesting significant potential for speedup in wall clock time. We confirm our theoretical results with numerical experiments on some practical problems.
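
To make the update concrete, below is a minimal sketch of SGLD where each gradient is evaluated at a delayed (stale) iterate, mimicking asynchronous workers. The quadratic potential `grad_U`, step size `eta`, and the uniform delay model with bound `tau_max` are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of SGLD with delayed gradients on a toy log-concave target.
# Assumptions (not from the paper): U(x) = 0.5 * ||x||^2 (standard Gaussian
# target), a fixed step size, and delays drawn uniformly from {0, ..., tau_max}.
import numpy as np

rng = np.random.default_rng(0)

def grad_U(x):
    # Gradient of the toy negative log-posterior U(x) = 0.5 * ||x||^2.
    return x

def sgld_delayed(x0, eta=1e-2, tau_max=5, n_steps=10_000):
    """SGLD where the gradient at step k is evaluated at x_{k - tau_k}.

    The staleness tau_k models an asynchronous worker returning a gradient
    computed at an earlier iterate.
    """
    x = np.array(x0, dtype=float)
    history = [x.copy()]        # past iterates available to slow workers
    samples = []
    for k in range(n_steps):
        tau = rng.integers(0, min(tau_max, k) + 1)  # staleness of this gradient
        stale_x = history[-(tau + 1)]               # iterate from tau steps ago
        noise = rng.standard_normal(x.shape)
        # Langevin step: gradient at the *delayed* iterate plus injected noise.
        x = x - eta * grad_U(stale_x) + np.sqrt(2.0 * eta) * noise
        history.append(x.copy())
        samples.append(x.copy())
    return np.array(samples)

samples = sgld_delayed(np.zeros(2))
print("empirical mean:", samples[5000:].mean(axis=0))  # ~0 for the Gaussian target
print("empirical var: ", samples[5000:].var(axis=0))   # ~1 per coordinate
```

With `tau = 0` this reduces to standard SGLD; increasing `tau_max` lets one probe empirically the paper's claim that delayed gradient information does not significantly affect the convergence in measure.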
