Paper title
Bayesian model selection in the $\mathcal{M}$-open setting -- Approximate posterior inference and probability-proportional-to-size subsampling for efficient large-scale leave-one-out cross-validation
Paper authors
Paper abstract
Comparison of competing statistical models is an essential part of psychological research. From a Bayesian perspective, various approaches to model comparison and selection have been proposed in the literature. However, the applicability of these approaches depends strongly on the assumptions made about the model space $\mathcal{M}$, the so-called model view. Furthermore, traditional methods like leave-one-out cross-validation (LOO-CV) estimate the expected log predictive density (ELPD) of a model to investigate how the model generalises out-of-sample, which quickly becomes computationally inefficient as the sample size grows. Here, we provide a tutorial on approximate Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO), a computationally efficient method for Bayesian model comparison. First, we discuss several model views and the available Bayesian model comparison methods in each. We then use Bayesian logistic regression as a running example to demonstrate how the method is applied in practice, and show that it outperforms other methods like LOO-CV or information criteria in terms of computational effort while providing similarly accurate ELPD estimates. In a second step, we show how even large-scale models can be compared efficiently by using posterior approximations in combination with probability-proportional-to-size subsampling. We show how to compare competing models based on the ELPD estimates provided, and how to conduct posterior predictive checks to safeguard against overconfidence in one of the models under consideration. We conclude that the method is attractive for mathematical psychologists who aim to compare several competing statistical models, which may be high-dimensional and in the big-data regime.
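
The workflow summarised in the abstract (fit competing Bayesian models, estimate their ELPD via PSIS-LOO, rank the models, and run posterior predictive checks) can be illustrated with a minimal Python sketch using PyMC and ArviZ. This is an illustrative assumption, not the paper's own code or tutorial material: the simulated data, model names, and prior choices below are placeholders, and the probability-proportional-to-size subsampling variant for large-scale problems is not shown here (that variant is implemented, for example, as loo_subsample() in the R loo package).

import numpy as np
import pymc as pm
import arviz as az

# Simulated data: binary outcome with three standardised predictors
# (sizes, coefficients, and names are illustrative assumptions).
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))
beta_true = np.array([1.0, -0.5, 0.25])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

def fit_logistic(X, y):
    """Bayesian logistic regression; returns the model and its posterior draws."""
    with pm.Model() as model:
        beta = pm.Normal("beta", mu=0.0, sigma=1.0, shape=X.shape[1])
        pm.Bernoulli("y", logit_p=pm.math.dot(X, beta), observed=y)
        idata = pm.sample(1000, tune=1000, chains=2, random_seed=1,
                          idata_kwargs={"log_likelihood": True},
                          progressbar=False)
    return model, idata

model_full, idata_full = fit_logistic(X, y)         # model with all three predictors
model_red, idata_red = fit_logistic(X[:, :2], y)    # competing, reduced model

# PSIS-LOO estimate of the ELPD; the Pareto-k diagnostic flags observations
# for which the importance-sampling approximation is unreliable.
print(az.loo(idata_full, pointwise=True))

# Rank the competing models by their estimated ELPD.
print(az.compare({"full": idata_full, "reduced": idata_red}))

# Posterior predictive check for the preferred model, to guard against
# overconfidence in the selected model.
with model_full:
    pm.sample_posterior_predictive(idata_full, extend_inferencedata=True)
az.plot_ppc(idata_full)

az.compare() reports the estimated ELPD of each model together with the pairwise ELPD difference and its standard error, which is the quantity the abstract suggests using to decide between competing models.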