Paper Title
Joint Modeling of an Outcome Variable and Integrated Omic Datasets Using GLM-PO2PLS
Authors
Abstract
In many studies of human diseases, multiple omic datasets are measured. Typically, these omic datasets are studied one by one with the disease, so the relationships between omics are overlooked. Modeling the joint part of multiple omics and its association with the disease outcome will provide insights into the complex molecular basis of the disease. In this article, we extend dimension reduction methods that model the joint part of omics to a novel method that jointly models an outcome variable with omics. We establish the identifiability of the model and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally distributed and Bernoulli-distributed outcomes. Test statistics are proposed to infer the association between the outcome and omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The model is illustrated by a Down syndrome study in which Down syndrome and two omics (methylation and glycomics) are jointly modeled.