Isotropic SGD: a Practical Approach to Bayesian Posterior Sampling

Paper title

Isotropic SGD: a Practical Approach to Bayesian Posterior Sampling

Authors

Giulio Franzese, Rosa Candela, Dimitrios Milios, Maurizio Filippone, Pietro Michiardi

Abstract

In this work we define a unified mathematical framework to deepen our understanding of the role of stochastic gradient (SG) noise in the behavior of stochastic gradient Markov chain Monte Carlo (SGMCMC) sampling algorithms. Our formulation unlocks the design of a novel, practical approach to posterior sampling, which makes the SG noise isotropic using a fixed learning rate that we determine analytically, and which requires weaker assumptions than existing algorithms. In contrast, a common trait of existing SGMCMC algorithms is to approximate the isotropy condition either by drowning the gradients in additive noise (annealing the learning rate) or by making restrictive assumptions on the SG noise covariance and the geometry of the loss landscape. Extensive experimental validations indicate that our proposal is competitive with the state of the art in SGMCMC, while being much more practical to use.
