Title
Multiple hypothesis screening using mixtures of non-local distributions with applications to genomic studies
Authors
Abstract
The analysis of large-scale datasets, especially in biomedical contexts, frequently involves a principled screening of multiple hypotheses. The celebrated two-group model jointly models the distribution of the test statistics as a mixture of two competing densities, the null and the alternative distributions. We investigate the use of weighted densities and, in particular, non-local densities as working alternative distributions, to enforce separation from the null and thus refine the screening procedure. We show how, for a fixed mixture proportion, these weighted alternatives improve various operating characteristics of the resulting tests, such as the Bayesian false discovery rate, relative to a local, unweighted likelihood approach. Parametric and nonparametric model specifications are proposed, along with efficient samplers for posterior inference. By means of a simulation study, we show how our model compares with both well-established and state-of-the-art alternatives in terms of various operating characteristics. Finally, to illustrate the versatility of our method, we conduct three differential expression analyses with publicly available datasets from genomic studies of heterogeneous nature.
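To make the two-group idea concrete, here is a minimal sketch in Python. It is an illustration, not the authors' model or sampler. It assumes a standard normal null, a fixed null proportion `p0`, and a first-order moment density as the non-local alternative (a normal density weighted by z², so it vanishes at zero). The function names, the scale `tau`, and the value of `p0` are hypothetical choices made for this example only.

```python
import numpy as np

def normal_pdf(z, sd=1.0):
    """Density of N(0, sd^2) evaluated at z."""
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def local_alt_pdf(z, tau=2.0):
    """Assumed local alternative: N(0, tau^2), positive at z = 0."""
    return normal_pdf(z, sd=tau)

def nonlocal_alt_pdf(z, tau=2.0):
    """Assumed non-local alternative: weight z^2 applied to N(0, tau^2).

    Dividing by tau^2 (the second moment of N(0, tau^2)) normalises the
    weighted density; it is exactly zero at z = 0.
    """
    z = np.asarray(z, dtype=float)
    return (z ** 2 / tau ** 2) * normal_pdf(z, sd=tau)

def posterior_null_prob(z, p0, alt_pdf):
    """Two-group posterior probability that a statistic comes from the null."""
    null_part = p0 * normal_pdf(z)
    alt_part = (1.0 - p0) * alt_pdf(z)
    return null_part / (null_part + alt_part)

def bayesian_fdr(post_null, threshold):
    """Average posterior null probability over the rejected hypotheses."""
    post_null = np.asarray(post_null, dtype=float)
    rejected = post_null <= threshold
    if not rejected.any():
        return 0.0  # no rejections, so no false discoveries
    return float(post_null[rejected].mean())

if __name__ == "__main__":
    z = np.array([0.0, 1.0, 2.5, 4.0])
    p0 = 0.9  # hypothetical fixed mixture proportion
    print("local    :", posterior_null_prob(z, p0, local_alt_pdf))
    print("non-local:", posterior_null_prob(z, p0, nonlocal_alt_pdf))
```

Under these assumptions the non-local alternative assigns no mass at zero, so a statistic at exactly zero is attributed entirely to the null, while statistics far from zero receive a smaller posterior null probability than under the local alternative. This is the separation from the null that the abstract refers to.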