Paper Title
Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
Paper Authors
Paper Abstract
Recent years have seen advances in generalization bounds for noisy stochastic algorithms, especially stochastic gradient Langevin dynamics (SGLD), based on stability (Mou et al., 2018; Li et al., 2020) and information-theoretic approaches (Xu and Raginsky, 2017; Negrea et al., 2019; Steinke and Zakynthinou, 2020). In this paper, we unify and substantially generalize stability-based generalization bounds and make three technical contributions. First, we bound the generalization error in terms of expected (not uniform) stability, which arguably leads to quantitatively sharper bounds. Second, as our main contribution, we introduce Exponential Family Langevin Dynamics (EFLD), a substantial generalization of SGLD that includes noisy versions of Sign-SGD and quantized SGD as special cases. We establish data-dependent expected-stability generalization bounds for any EFLD algorithm with an O(1/n) sample dependence and a dependence on gradient discrepancy rather than the norm of gradients, yielding significantly sharper bounds. Third, we establish optimization guarantees for special cases of EFLD. Further, empirical results on benchmarks illustrate that our bounds are non-vacuous, quantitatively sharper than existing bounds, and behave correctly under noisy labels.
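The abstract names SGLD and noisy Sign-SGD as concrete instances of EFLD but does not spell out their update rules. As a minimal sketch, the snippet below writes down the standard SGLD update (gradient step plus Gaussian noise) and one plausible noisy Sign-SGD update in which each coordinate's step sign is drawn from a Bernoulli exponential family whose natural parameter is proportional to the gradient; the hyperparameters `lr`, `beta`, and `temp` are illustrative assumptions, not values or formulations taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(theta, grad, lr=1e-3, beta=1e4):
    """One SGLD update (standard form, not paper-specific):
    theta' = theta - lr * grad + sqrt(2 * lr / beta) * N(0, I),
    i.e. a gradient step perturbed by isotropic Gaussian noise.
    """
    noise = rng.standard_normal(theta.shape)
    return theta - lr * grad + np.sqrt(2.0 * lr / beta) * noise

def noisy_sign_sgd_step(theta, grad, lr=1e-3, temp=1.0):
    """One noisy Sign-SGD update (illustrative): each coordinate moves
    by +/- lr, with P(step = -lr) = sigmoid(temp * grad), so the sign is
    a Bernoulli exponential-family draw and the expected step,
    -lr * tanh(temp * grad / 2), points downhill.
    """
    p_minus = 1.0 / (1.0 + np.exp(-temp * grad))  # P(coordinate steps by -lr)
    signs = np.where(rng.random(theta.shape) < p_minus, -1.0, 1.0)
    return theta + lr * signs

# Illustrative run on a toy quadratic loss f(theta) = 0.5 * ||theta||^2,
# whose gradient is theta itself.
theta = rng.standard_normal(5)
for _ in range(1000):
    theta = noisy_sign_sgd_step(theta, grad=theta, lr=1e-2)
print(np.linalg.norm(theta))  # the iterate hovers near the minimizer at 0
```

Both updates share the template "move by a random step whose distribution is an exponential family tilted by the gradient": Gaussian noise recovers SGLD, while the Bernoulli sign noise gives the noisy Sign-SGD variant; this is the sense in which EFLD unifies them, though the paper's exact parameterization is not given in the abstract.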