Stage I non-small cell lung cancer stratification by using a model-based clustering algorithm with covariates

Paper title

Stage I non-small cell lung cancer stratification by using a model-based clustering algorithm with covariates

Authors

Carlos Relvas, André Fujita

Abstract

Lung cancer is currently the leading cause of cancer deaths. Among its subtypes, the number of patients diagnosed with stage I non-small cell lung cancer (NSCLC), particularly adenocarcinoma, has been increasing. It is estimated that 30-40% of stage I patients will relapse and 10-30% will die due to recurrence, clearly suggesting the presence of a subgroup that could benefit from additional therapy. We hypothesize that current attempts to identify stage I NSCLC subgroups have failed because of covariate effects, such as age at diagnosis and differentiation, which may be masking the results. In this context, to stratify stage I NSCLC, we propose CEM-Co, a model-based clustering algorithm that removes or minimizes the effects of undesirable covariates during the clustering process. We applied CEM-Co to a gene expression data set composed of 129 subjects diagnosed with stage I NSCLC and successfully identified a subgroup with a significantly different phenotype (poor prognosis), whereas standard clustering algorithms failed.
