Paper Title
Hessian-Free High-Resolution Nesterov Acceleration for Sampling
Paper Authors
Paper Abstract
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step size is employed \citep{shi2021understanding}. This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods. More precisely, we reformulate the NAG optimizer for strongly convex functions (NAG-SC) as a Hessian-Free High-Resolution ODE, change its high-resolution coefficient to a hyperparameter, inject appropriate noise, and discretize the resulting diffusion process. The acceleration effect of the new hyperparameter is quantified, and it is shown not to be an artificial effect created by time-rescaling. Instead, acceleration beyond underdamped Langevin in $W_2$ distance is quantitatively established for log-strongly-concave-and-smooth targets, at both the continuous-dynamics level and the discrete-algorithm level. Empirical experiments in both log-strongly-concave and multi-modal cases also numerically demonstrate this acceleration.
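The procedure sketched in the abstract (take a kinetic-Langevin-like ODE with an extra high-resolution coefficient, inject noise, discretize) can be illustrated with a minimal sketch. This is not the paper's algorithm or its analyzed discretization; it assumes a generic dynamics of the form $dx = (v - \alpha \nabla f(x))\,dt$, $dv = (-\gamma v - \nabla f(x))\,dt + \sqrt{2\gamma}\,dW$, where the hyperparameter $\alpha$ plays the role of the high-resolution coefficient and $\alpha = 0$ recovers underdamped Langevin, discretized by plain Euler-Maruyama on a toy Gaussian target. The function name, parameter values, and target are placeholders.

```python
import numpy as np

def accelerated_langevin_sample(grad_f, x0, v0, alpha, gamma, h, n_steps, rng):
    """Euler-Maruyama sketch of a hyperparameterized kinetic-Langevin diffusion.

    Assumed (illustrative) dynamics:
        dx = (v - alpha * grad f(x)) dt
        dv = (-gamma * v - grad f(x)) dt + sqrt(2 * gamma) dW
    alpha = 0 reduces to standard underdamped Langevin.
    Returns the final positions of all chains.
    """
    x, v = x0.copy(), v0.copy()
    for _ in range(n_steps):
        g = grad_f(x)
        x_new = x + h * (v - alpha * g)           # position update with extra -alpha*grad term
        v_new = (v + h * (-gamma * v - g)
                 + np.sqrt(2.0 * gamma * h) * rng.standard_normal(v.shape))
        x, v = x_new, v_new
    return x

# Toy log-strongly-concave target: f(x) = x^2 / 2, i.e. sample from N(0, 1).
rng = np.random.default_rng(0)
n_chains = 4000
x = accelerated_langevin_sample(
    grad_f=lambda x: x,
    x0=rng.standard_normal(n_chains),
    v0=np.zeros(n_chains),
    alpha=0.1, gamma=2.0, h=0.02, n_steps=3000, rng=rng,
)
print(float(x.mean()), float(x.var()))  # empirical moments should be near (0, 1)
```

For small $h$ the empirical variance of the chains sits near the target variance 1, with the usual $O(h)$ Euler-Maruyama bias; the point of the sketch is only the mechanics (noise injection plus discretization), not the paper's quantitative $W_2$ guarantees.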