Paper Title

On the role of data in PAC-Bayes bounds

Paper Authors

Gintare Karolina Dziugaite, Kyle Hsu, Waseem Gharbieh, Gabriel Arpino, Daniel M. Roy

Paper Abstract

The dominant term in PAC-Bayes bounds is often the Kullback--Leibler divergence between the posterior and prior. For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected value of the bound by choosing the prior to be the expected posterior, which we call the oracle prior on account of the fact that it is distribution dependent. In this work, we show that the bound based on the oracle prior can be suboptimal: In some cases, a stronger bound is obtained by using a data-dependent oracle prior, i.e., a conditional expectation of the posterior, given a subset of the training data that is then excluded from the empirical risk term. While using data to learn a prior is a known heuristic, its essential role in optimal bounds is new. In fact, we show that using data can mean the difference between vacuous and nonvacuous bounds. We apply this new principle in the setting of nonconvex learning, simulating data-dependent oracle priors on MNIST and Fashion MNIST with and without held-out data, and demonstrating new nonvacuous bounds in both cases.
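
As a reading aid, the LaTeX sketch below spells out the quantities the abstract refers to, assuming an Alquier-style linear PAC-Bayes bound for a loss bounded in [0,1]; the exact constants and notation here are illustrative assumptions, not the paper's precise statement.

% Linear PAC-Bayes bound (Alquier-style form; constants illustrative).
% With probability at least 1 - \delta over an i.i.d. sample S of size m,
% simultaneously for all posteriors Q, with \lambda > 0 fixed in advance:
\[
\mathbb{E}_{h \sim Q}\!\left[ R(h) \right]
  \;\le\; \mathbb{E}_{h \sim Q}\!\left[ \hat{R}_S(h) \right]
  \;+\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln(1/\delta)}{\lambda}
  \;+\; \frac{\lambda}{8m}.
\]
% Only the KL term depends on the prior P. In expectation over S it is
% minimized by the (distribution-dependent) oracle prior
\[
P^{\star} \;=\; \mathbb{E}_{S}\!\left[ Q_S \right],
\]
% since \arg\min_{P} \mathbb{E}_{S}[\mathrm{KL}(Q_S \,\|\, P)] = \mathbb{E}_{S}[Q_S].
% The data-dependent oracle prior of the abstract instead conditions on a
% subset S_J of the training data,
\[
P^{\star}(S_J) \;=\; \mathbb{E}\!\left[ Q_S \mid S_J \right],
\]
% and the empirical risk term is then evaluated only on S \setminus S_J.

In this reading, the paper's observation is that sacrificing part of the sample from the empirical risk term can pay for itself through a much smaller KL term, which is how using data can separate vacuous from nonvacuous bounds.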
