Paper Title
Robust Sparse Bayesian Infinite Factor Models
Paper Authors
Paper Abstract
Most previous work on, and applications of, Bayesian factor models have assumed a normal likelihood regardless of its validity. We propose a Bayesian factor model for heavy-tailed high-dimensional data, based on a multivariate Student-$t$ likelihood, to obtain better covariance estimation. We use the multiplicative gamma process shrinkage prior and the factor number adaptation scheme proposed in Bhattacharya & Dunson [Biometrika (2011) 291-306]. Since a naive Gibbs sampler for the proposed model suffers from slow mixing, we propose a Markov chain Monte Carlo algorithm in which the fast mixing of Hamiltonian Monte Carlo is exploited for some parameters of the proposed model. Simulation results illustrate the gain in covariance estimation performance for heavy-tailed high-dimensional data. We also provide a theoretical result that the posterior of the proposed model is weakly consistent under reasonable conditions. We conclude the paper with an application of the proposed factor model to breast cancer metastasis prediction given DNA signature data from cancer cells.
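As a rough illustration of the ingredients named in the abstract, the following Python sketch draws factor loadings from the multiplicative gamma process shrinkage prior of Bhattacharya & Dunson and then simulates heavy-tailed data by writing the multivariate Student-$t$ as a gamma scale mixture of normals. It is a generative sketch only, not the paper's sampler: the function names, the hyperparameter values (`a1`, `a2`, `nu`, `df`), and the choice to share one mixing weight across factors and noise are assumptions made for illustration.

```python
import numpy as np


def sample_mgp_loadings(p, k, a1=2.0, a2=3.0, nu=3.0, rng=None):
    """Draw a p x k loading matrix from the multiplicative gamma process prior.

    Hyperparameter defaults are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Local precisions phi_jh ~ Gamma(nu/2, rate=nu/2); NumPy uses (shape, scale).
    phi = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(p, k))
    # delta_1 ~ Gamma(a1, 1), delta_l ~ Gamma(a2, 1) for l >= 2.
    delta = np.concatenate([
        rng.gamma(shape=a1, scale=1.0, size=1),
        rng.gamma(shape=a2, scale=1.0, size=k - 1),
    ])
    # Column-wise global precisions tau_h = prod_{l <= h} delta_l.
    tau = np.cumprod(delta)
    # lambda_jh ~ N(0, 1 / (phi_jh * tau_h)).
    loadings = rng.normal(size=(p, k)) / np.sqrt(phi * tau)
    return loadings, tau


def sample_t_factor_data(n, loadings, sigma2, df=4.0, rng=None):
    """Simulate n heavy-tailed observations from a Student-t factor model.

    Assumes y_i = (Lambda eta_i + eps_i) / sqrt(w_i) with w_i ~ Gamma(df/2, rate=df/2),
    which makes y_i multivariate t with scale matrix Lambda Lambda^T + diag(sigma2).
    """
    rng = np.random.default_rng() if rng is None else rng
    p, k = loadings.shape
    w = rng.gamma(shape=df / 2.0, scale=2.0 / df, size=n)
    eta = rng.normal(size=(n, k))                    # latent factors
    eps = rng.normal(size=(n, p)) * np.sqrt(sigma2)  # idiosyncratic noise
    return (eta @ loadings.T + eps) / np.sqrt(w)[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    loadings, tau = sample_mgp_loadings(p=50, k=10, rng=rng)
    y = sample_t_factor_data(n=200, loadings=loadings, sigma2=np.full(50, 0.5), rng=rng)
    print(y.shape, tau[:3])
```

With `a2 > 1`, the precisions `tau` tend to grow with the column index, so later columns of the loading matrix are shrunk more strongly toward zero; this is what lets the number of effective factors be adapted during sampling.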