Paper title
Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference
Paper authors
Paper abstract
Non-linear hierarchical models are commonly used in many disciplines. However, inference in the presence of non-nested effects and on large datasets is challenging and computationally burdensome. This paper provides two contributions to scalable and accurate inference. First, I derive a new mean-field variational algorithm for estimating binomial logistic hierarchical models with an arbitrary number of non-nested random effects. Second, I propose "marginally augmented variational Bayes" (MAVB) that further improves the initial approximation through a step of Bayesian post-processing. I prove that MAVB provides a guaranteed improvement in the approximation quality at low computational cost and induces dependencies that were assumed away by the initial factorization assumptions. I apply these techniques to a study of voter behavior using a high-dimensional application of the popular approach of multilevel regression and post-stratification (MRP). Existing estimation took hours whereas the proposed algorithms run in minutes. The posterior means are well-recovered even under strong factorization assumptions. Applying MAVB further improves the approximation by partially correcting the underestimated variance. The proposed methodology is implemented in an open-source software package.
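To make the idea of a mean-field variational fit with non-nested effects concrete, the following is a minimal sketch and not the paper's algorithm or its software package. The abstract does not describe the derivation, so everything here is an assumption: the sketch uses Polya-Gamma data augmentation (a common device for logistic likelihoods), a fully factorized Gaussian approximation, two crossed factors, and prior variances that are held fixed rather than estimated. The function name `cavi_crossed_logit` and its parameters are hypothetical.

```python
import math

def cavi_crossed_logit(y, n, a, b, prior_var=1.0, iters=200):
    """Coordinate-ascent mean-field fit of logit(p_i) = mu + alpha[a_i] + gamma[b_i].

    y[i] successes out of n[i] trials; a[i] and b[i] index two crossed
    (non-nested) factors. ASSUMPTIONS: Polya-Gamma augmentation, fully
    factorized Gaussian q, fixed prior variances. Returns the means and
    variances of q for [mu, alpha..., gamma...].
    """
    n_a, n_b = max(a) + 1, max(b) + 1
    # Parameter layout: [intercept, factor-A effects, factor-B effects]
    idx = [(0, 1 + a[i], 1 + n_a + b[i]) for i in range(len(y))]
    K = 1 + n_a + n_b
    prior_prec = [1.0 / 100.0] + [1.0 / prior_var] * (K - 1)
    m = [0.0] * K
    v = [1.0 / p for p in prior_prec]
    kappa = [yi - ni / 2.0 for yi, ni in zip(y, n)]
    rows = [[] for _ in range(K)]  # observations that involve each parameter
    for i, ks in enumerate(idx):
        for k in ks:
            rows[k].append(i)
    for _ in range(iters):
        # q(omega_i) = PG(n_i, c_i) with c_i^2 = E_q[eta_i^2]
        ew = []
        for i, ks in enumerate(idx):
            mean_eta = sum(m[k] for k in ks)
            c = math.sqrt(mean_eta ** 2 + sum(v[k] for k in ks))
            ew.append(n[i] * math.tanh(c / 2.0) / (2.0 * c))  # E_q[omega_i]
        # Scalar Gaussian update for each parameter, holding the others fixed
        for k in range(K):
            prec = prior_prec[k] + sum(ew[i] for i in rows[k])
            lin = sum(
                kappa[i] - ew[i] * (sum(m[j] for j in idx[i]) - m[k])
                for i in rows[k]
            )
            v[k] = 1.0 / prec
            m[k] = lin / prec
    return m, v

# Toy 2x2 crossed design with 1000 trials per cell (made-up counts)
means, variances = cavi_crossed_logit(
    y=[310, 450, 550, 690], n=[1000] * 4, a=[0, 0, 1, 1], b=[0, 1, 0, 1]
)
```

Because every factor of the approximation is independent, the returned variances ignore posterior correlations between effects. This is the kind of variance underestimation that the abstract says MAVB partially corrects; the sketch does not implement that post-processing step.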