Paper Title

Markov Neighborhood Regression for High-Dimensional Inference

Authors

Faming Liang, Jingnan Xue, Bochao Jia

Abstract

This paper proposes an innovative method for constructing confidence intervals and assessing $p$-values in statistical inference for high-dimensional linear models. The method breaks the high-dimensional inference problem into a series of low-dimensional inference problems: for each regression coefficient $\beta_i$, the confidence interval and $p$-value are computed by regressing on a subset of variables selected according to the conditional independence relations between the corresponding variable $X_i$ and the other variables. Because this subset forms a Markov neighborhood of $X_i$ in the Markov network formed by all the variables $X_1, X_2, \ldots, X_p$, the method is called Markov neighborhood regression. The method is tested on high-dimensional linear, logistic, and Cox regression. The numerical results indicate that it significantly outperforms existing methods. Based on Markov neighborhood regression, a method for learning causal structures in high-dimensional linear models is proposed and applied to the identification of drug-sensitive genes and cancer driver genes. The idea of using conditional independence relations for dimension reduction is general and can potentially be extended to other high-dimensional or big data problems.
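
The low-dimensional step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the conditioning subset is estimated, so the sketch assumes that subset is already known and passed in. The function name `subset_regression_inference`, the AR(1) chain used to simulate the covariates, and the large-sample normal approximation for the interval and $p$-value are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def subset_regression_inference(X, y, j, neighbors, alpha=0.05):
    """OLS of y on X[:, j] plus a supplied conditioning subset.

    Hypothetical helper: `neighbors` is assumed to be given; how to
    estimate it from data is outside the scope of this sketch.
    Returns (estimate, (ci_low, ci_high), p_value) for coefficient j.
    """
    cols = [j] + [k for k in neighbors if k != j]
    n = X.shape[0]
    # Design matrix: intercept, then X_j, then the conditioning variables.
    D = np.column_stack([np.ones(n), X[:, cols]])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ coef
    dof = n - D.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(D.T @ D)
    est = coef[1]
    se = np.sqrt(cov[1, 1])
    # Assumption: normal approximation instead of the exact t distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(est / se)))
    return est, (est - z * se, est + z * se), p_value


# Simulated example with more variables than samples (p > n).
rng = np.random.default_rng(0)
n, p, rho = 200, 500, 0.5
Z = rng.standard_normal((n, p))
X = np.empty((n, p))
X[:, 0] = Z[:, 0]
for k in range(1, p):
    # AR(1) chain: each X_k depends directly only on X_{k-1}.
    X[:, k] = rho * X[:, k - 1] + np.sqrt(1 - rho**2) * Z[:, k]

beta = np.zeros(p)
beta[[10, 50, 120]] = [2.0, 1.0, -1.0]
y = X @ beta + rng.standard_normal(n)

# In a chain, the neighbors of X_10 are X_9 and X_11.
est, ci, p_value = subset_regression_inference(X, y, j=10, neighbors=[9, 11])
print(est, ci, p_value)
```

Each call fits a regression with only a handful of columns, so ordinary least-squares inference applies even though the full model has more variables than observations. Under the Gaussian chain assumed here, variables outside the neighborhood are conditionally independent of $X_{10}$ given its neighbors, which is why omitting them does not bias the estimate of $\beta_{10}$; they only inflate the residual variance.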
