Paper Title

On the Convergence of Langevin Monte Carlo: The Interplay between Tail Growth and Smoothness

Authors

Murat A. Erdogdu, Rasa Hosseinzadeh

Abstract

We study sampling from a target distribution $\nu_* = e^{-f}$ using the unadjusted Langevin Monte Carlo (LMC) algorithm. For any potential function $f$ whose tails behave like $\|x\|^{\alpha}$ for $\alpha \in [1,2]$ and that has a $\beta$-Hölder continuous gradient, we prove that $\widetilde{\mathcal{O}}\Big(d^{\frac{1}{\beta}+\frac{1+\beta}{\beta}\left(\frac{2}{\alpha} - \boldsymbol{1}_{\{\alpha \neq 1\}}\right)} \epsilon^{-\frac{1}{\beta}}\Big)$ steps are sufficient to reach the $\epsilon$-neighborhood of a $d$-dimensional target distribution $\nu_*$ in KL-divergence. In terms of its $\epsilon$ dependency, this convergence rate is not directly influenced by the tail growth rate $\alpha$ of the potential function, as long as the growth is at least linear; it relies only on the order of smoothness $\beta$. One notable consequence is that for potentials with Lipschitz gradient, i.e. $\beta = 1$, our rate recovers, in terms of $\epsilon$ dependency, the best-known rate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which was established for strongly convex potentials; we show that the same rate is achievable for a wider class of potentials that are degenerately convex at infinity. The growth rate $\alpha$ starts to affect the established rate in high dimensions, where $d$ is large; furthermore, the rate recovers the best-known dimension dependency when the tail growth of the potential is quadratic, i.e. $\alpha = 2$, in the current setup. Our framework allows for finite perturbations and any order of smoothness $\beta \in (0,1]$; consequently, our results are applicable to a wide class of non-convex potentials that are weakly smooth and exhibit at least linear tail growth.
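As a rough illustration of the algorithm and the rate in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard unadjusted LMC update $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\,\xi_k$ with standard Gaussian $\xi_k$, and it uses an illustrative potential $f(x) = (1 + \|x\|^2)^{\alpha/2}$, whose tails grow like $\|x\|^{\alpha}$. The potential, the step size, and the function names `grad_potential`, `lmc_sample`, and `rate_exponents` are all assumptions made for this example and do not come from the paper.

```python
import math
import random


def grad_potential(x, alpha):
    """Gradient of the illustrative potential f(x) = (1 + ||x||^2)^(alpha / 2).

    This potential is an assumption for the example; its tails grow like ||x||^alpha.
    """
    sq_norm = sum(xi * xi for xi in x)
    scale = alpha * (1.0 + sq_norm) ** (alpha / 2.0 - 1.0)
    return [scale * xi for xi in x]


def lmc_sample(grad_f, x0, step_size, n_steps, rng):
    """Run unadjusted LMC and return the final iterate.

    Each step applies x <- x - step_size * grad_f(x) + sqrt(2 * step_size) * noise,
    where noise is a fresh standard Gaussian vector.
    """
    x = list(x0)
    noise_scale = math.sqrt(2.0 * step_size)
    for _ in range(n_steps):
        g = grad_f(x)
        x = [xi - step_size * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
    return x


def rate_exponents(alpha, beta):
    """Exponents of d and of 1/epsilon in the step bound quoted in the abstract."""
    indicator = 0.0 if alpha == 1 else 1.0  # the indicator 1_{alpha != 1}
    d_exp = 1.0 / beta + (1.0 + beta) / beta * (2.0 / alpha - indicator)
    return d_exp, 1.0 / beta


if __name__ == "__main__":
    rng = random.Random(0)
    # Step size and step count are arbitrary illustrative choices.
    sample = lmc_sample(lambda v: grad_potential(v, 1.5), [0.0] * 3, 0.01, 1000, rng)
    print(sample)
    print(rate_exponents(2, 1))
```

With quadratic tails and Lipschitz gradient, `rate_exponents(2, 1)` returns `(1.0, 1.0)`, matching the $\widetilde{\mathcal{O}}(d\epsilon^{-1})$ rate stated above. The helper only evaluates the exponents in the quoted bound; it ignores logarithmic factors and constants.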
