Paper Title

An approximate Bayes factor based high dimensional MANOVA using Random Projections

Authors

Roger S. Zoh, Fangzheng Xie

Abstract

The high-dimensional mean vector testing problem for two or more groups remains a very active research area. In these settings, traditional tests are not applicable because they involve the inversion of rank-deficient group covariance matrices. Current approaches address this problem by basing the test on an assumed sparse or diagonal covariance matrix, potentially ignoring complex dependencies between features. In this paper, we develop a Bayes factor (BF) based testing procedure for comparing two or more population means in (very) high-dimensional settings. Two versions of the Bayes factor based test statistic are considered, both built on a random projection (RP) approach. RPs are appealing because they make no assumption about the form of the dependency across features in the data. The final test statistic is based on an ensemble of Bayes factors corresponding to multiple replications of randomly projected data. The two proposed test statistics are compared through a battery of simulation settings. Finally, they are applied to the analysis of a publicly available genomic single-cell RNA-seq (scRNA-seq) dataset.
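The project-then-aggregate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the form of the Bayes factor, so the classical two-sample Hotelling T² statistic is used here as a stand-in for the per-projection statistic. The Gaussian projection matrix, the simple average across projections, and the function name `projected_hotelling_t2` with its parameters are all assumptions made for illustration.

```python
import numpy as np


def projected_hotelling_t2(x, y, k, n_proj=100, seed=0):
    """Average a two-sample Hotelling T^2 over random Gaussian projections.

    x: (n1, p) array, y: (n2, p) array, with p possibly much larger than n1 + n2.
    k: projected dimension; it must satisfy k < n1 + n2 - 2 so that the
       pooled covariance of the projected data is invertible.
    NOTE: Hotelling T^2 is a stand-in, not the Bayes factor from the paper.
    """
    rng = np.random.default_rng(seed)
    n1, p = x.shape
    n2 = y.shape[0]
    stats = np.empty(n_proj)
    for i in range(n_proj):
        # Random projection from p features down to k (assumed Gaussian entries).
        r = rng.standard_normal((p, k)) / np.sqrt(k)
        xr, yr = x @ r, y @ r
        diff = xr.mean(axis=0) - yr.mean(axis=0)
        # Pooled covariance in the projected space (k x k, full rank when k is small).
        sx = np.atleast_2d(np.cov(xr, rowvar=False))
        sy = np.atleast_2d(np.cov(yr, rowvar=False))
        pooled = ((n1 - 1) * sx + (n2 - 1) * sy) / (n1 + n2 - 2)
        # Solve the linear system instead of forming an explicit inverse.
        stats[i] = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(pooled, diff)
    # Ensemble step: combine the per-projection statistics (here, a plain mean).
    return stats.mean()
```

With, say, 20 samples per group and 200 features, the full pooled covariance matrix is singular, but each 5-dimensional projected covariance is invertible, which is what makes the per-projection statistic computable.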
