Paper Title
Hessian-Free High-Resolution Nesterov Acceleration for Sampling
Paper Authors
Paper Abstract
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step size is employed \citep{shi2021understanding}. This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods. More precisely, we reformulate the NAG optimizer for strongly convex functions (NAG-SC) as a Hessian-Free High-Resolution ODE, change its high-resolution coefficient to a hyperparameter, inject appropriate noise, and discretize the resulting diffusion process. The acceleration effect of the new hyperparameter is quantified, and it is shown not to be an artificial effect created by time-rescaling. Instead, acceleration beyond underdamped Langevin in $W_2$ distance is quantitatively established for log-strongly-concave-and-smooth targets, at both the continuous-dynamics level and the discrete-algorithm level. Empirical experiments in both log-strongly-concave and multi-modal cases also numerically demonstrate this acceleration.
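The procedure sketched in the abstract (take a kinetic-Langevin-like ODE with an extra high-resolution coefficient, inject noise, discretize) can be illustrated with a minimal sketch. This is not the paper's algorithm or its analyzed discretization; it assumes a generic dynamics of the form $dx = (v - \alpha \nabla f(x))\,dt$, $dv = (-\gamma v - \nabla f(x))\,dt + \sqrt{2\gamma}\,dW$, where the hyperparameter $\alpha$ plays the role of the high-resolution coefficient and $\alpha = 0$ recovers underdamped Langevin, discretized by plain Euler-Maruyama on a toy Gaussian target. The function name, parameter values, and target are placeholders.

```python
import numpy as np

def accelerated_langevin_sample(grad_f, x0, v0, alpha, gamma, h, n_steps, rng):
    """Euler-Maruyama sketch of a hyperparameterized kinetic-Langevin diffusion.

    Assumed (illustrative) dynamics:
        dx = (v - alpha * grad f(x)) dt
        dv = (-gamma * v - grad f(x)) dt + sqrt(2 * gamma) dW
    alpha = 0 reduces to standard underdamped Langevin.
    Returns the final positions of all chains.
    """
    x, v = x0.copy(), v0.copy()
    for _ in range(n_steps):
        g = grad_f(x)
        x_new = x + h * (v - alpha * g)           # position update with extra -alpha*grad term
        v_new = (v + h * (-gamma * v - g)
                 + np.sqrt(2.0 * gamma * h) * rng.standard_normal(v.shape))
        x, v = x_new, v_new
    return x

# Toy log-strongly-concave target: f(x) = x^2 / 2, i.e. sample from N(0, 1).
rng = np.random.default_rng(0)
n_chains = 4000
x = accelerated_langevin_sample(
    grad_f=lambda x: x,
    x0=rng.standard_normal(n_chains),
    v0=np.zeros(n_chains),
    alpha=0.1, gamma=2.0, h=0.02, n_steps=3000, rng=rng,
)
print(float(x.mean()), float(x.var()))  # empirical moments should be near (0, 1)
```

For small $h$ the empirical variance of the chains sits near the target variance 1, with the usual $O(h)$ Euler-Maruyama bias; the point of the sketch is only the mechanics (noise injection plus discretization), not the paper's quantitative $W_2$ guarantees.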