Paper title
Strategies to integrate multi-omics data for patient survival prediction
Paper authors
Paper abstract
Genomics, especially multi-omics, has made precision medicine feasible. Complete and publicly accessible multi-omics resources with clinical outcomes, such as The Cancer Genome Atlas (TCGA), are excellent test beds for developing computational methods that integrate multi-omics data to predict patient cancer phenotypes. We have been using TCGA multi-omics data to predict cancer patient survival with a variety of approaches, including prior biological knowledge (such as pathways) and, more recently, deep-learning methods. Over time, we have developed methods such as Cox-nnet, DeepProg, and two-stage Cox-nnet to address the challenges posed by multi-omics and multi-modality data. Despite the limited sample sizes (hundreds to thousands) of the training datasets and the heterogeneous nature of human populations, these methods have shown significance and robustness in predicting patient survival in independent population cohorts. In the following, we describe in detail these methodologies, the modeling results, and the important biological insights revealed by these methods.