Paper Title

Factor-Augmented Regularized Model for Hazard Regression

Authors

Pierre Bayle, Jianqing Fan

Abstract

A prevalent feature of high-dimensional data is the dependence among covariates, and model selection is known to be challenging when covariates are highly correlated. To perform model selection for the high-dimensional Cox proportional hazards model in the presence of correlated covariates with factor structure, we propose a new model, Factor-Augmented Regularized Model for Hazard Regression (FarmHazard), which builds upon latent factors that drive covariate dependence and extends Cox's model. This new model generates procedures that operate in two steps: first learning factors and idiosyncratic components from high-dimensional covariate vectors, then using them as new predictors. Cox's model is a widely used semi-parametric model for survival analysis, where censored data and time-dependent covariates bring additional technical challenges. We prove model selection consistency and estimation consistency under mild conditions. We also develop a factor-augmented variable screening procedure to deal with strong correlations in ultra-high dimensional problems. Extensive simulations and real data experiments demonstrate that our procedures enjoy good performance and achieve better results on model selection, out-of-sample C-index, and screening than alternative methods.
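The first of the two steps described in the abstract, splitting the covariates into factors and idiosyncratic components, can be illustrated with a minimal sketch. The abstract does not say which factor estimator is used, so the principal-component estimator here is an assumption, as are the function name `estimate_factors`, the simulated data, and the choice of `k`. The second step (a regularized Cox fit on the new predictors) is not shown.

```python
import numpy as np

def estimate_factors(X, k):
    """Split X (n x p) into factors F (n x k), loadings B (p x k),
    and idiosyncratic components U (n x p) via principal components.
    Assumption: PCA is used here only for illustration."""
    Xc = X - X.mean(axis=0)                 # center each covariate
    n = Xc.shape[0]
    U_svd, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * U_svd[:, :k]           # normalized so that F.T @ F / n = I_k
    B = Xc.T @ F / n                        # least-squares loadings given F
    U = Xc - F @ B.T                        # what the factors do not explain
    return F, B, U

# Hypothetical simulated covariates with a 3-factor structure.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 3
F_true = rng.standard_normal((n, k))
B_true = rng.standard_normal((p, k))
X = F_true @ B_true.T + 0.5 * rng.standard_normal((n, p))

F_hat, B_hat, U_hat = estimate_factors(X, k)

# New predictors: idiosyncratic components alongside the estimated factors.
Z = np.hstack([U_hat, F_hat])
```

By construction the estimated idiosyncratic components are orthogonal to the estimated factors, which is what removes the factor-driven correlation among the original covariates. A penalized Cox regression would then take `Z` as its design matrix in place of `X`.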
