Paper title
Ultimate Pólya Gamma Samplers -- Efficient MCMC for possibly imbalanced binary and categorical data
Paper authors
Abstract
Modeling binary and categorical data is one of the most commonly encountered tasks of applied statisticians and econometricians. While Bayesian methods in this context have been available for decades now, they often require a high level of familiarity with Bayesian statistics or suffer from issues such as low sampling efficiency. To contribute to the accessibility of Bayesian models for binary and categorical data, we introduce novel latent variable representations based on Pólya-Gamma random variables for a range of commonly encountered logistic regression models. From these latent variable representations, new Gibbs sampling algorithms for binary, binomial, and multinomial logit models are derived. All models allow for a conditionally Gaussian likelihood representation, rendering extensions to more complex modeling frameworks such as state space models straightforward. However, sampling efficiency may still be an issue in these data augmentation based estimation frameworks. To counteract this, novel marginal data augmentation strategies are developed and discussed in detail. The merits of our approach are illustrated through extensive simulations and real data applications.