Paper title
Deep Multi-path Network Integrating Incomplete Biomarker and Chest CT Data for Evaluating Lung Cancer Risk
Paper authors
Paper abstract
Clinical data elements (CDEs) (e.g., age, smoking history), blood markers, and structural features from chest computed tomography (CT) are regarded as effective means of assessing lung cancer risk. These independent variables can provide complementary information, and we hypothesize that combining them will improve prediction accuracy. In practice, however, not all patients have all of these variables available. In this paper, we propose a new network design, termed the multi-path multi-modal missing network (M3Net), that integrates multi-modal data (i.e., CDEs, biomarkers, and CT images) and accounts for missing modalities by using a multi-path neural network. Each path learns discriminative features of one modality, and the modalities are fused in a second stage to produce an integrated prediction. The network can be trained end-to-end with both medical image features and CDEs/biomarkers, or it can make a prediction from a single modality. We evaluate M3Net on datasets that include three sites from the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions (MCL) project. Our method is cross-validated within a cohort of 1291 subjects (383 of whom have complete CDEs/biomarkers and CT images) and externally validated on a cohort of 99 subjects (all 99 with complete CDEs/biomarkers and CT images). Both the cross-validation and external-validation results show that combining multiple modalities significantly improves prediction performance over a single modality. The results also suggest that including subjects who are missing either CDEs/biomarkers or CT imaging features can contribute to the discriminatory power of our model (p < 0.05, two-tailed bootstrap test). In summary, the proposed M3Net framework provides an effective way to integrate image and non-image data in the context of missing information.
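The two-stage idea described in the abstract (one path per modality, followed by a fusion step that tolerates a missing modality) can be illustrated with a minimal sketch. This is not the M3Net implementation: the abstract does not specify the path architectures or the fusion rule, so the sketch assumes each path is a single linear layer and that fusion averages the logits of whichever modalities are present. The names `LinearPath` and `fuse_predict`, the modality keys `"cde"` and `"ct"`, and all weights are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


class LinearPath:
    """Hypothetical stand-in for one modality-specific path.

    A real path would be a learned sub-network (e.g. a CNN for CT images);
    here it is a single linear layer that maps features to one logit.
    """

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self.weights = list(weights)
        self.bias = bias

    def __call__(self, features: Sequence[float]) -> float:
        if len(features) != len(self.weights):
            raise ValueError("feature length does not match path weights")
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


def fuse_predict(
    paths: Dict[str, LinearPath],
    inputs: Dict[str, Optional[Sequence[float]]],
) -> float:
    """Second-stage fusion under an assumed rule: average the logits of the
    modalities that are present (not None), then apply a sigmoid.

    A missing modality is simply skipped, so the same model can score a
    subject who has only CDEs/biomarkers, only CT features, or both.
    """
    logits = [
        paths[name](features)
        for name, features in inputs.items()
        if features is not None
    ]
    if not logits:
        raise ValueError("at least one modality must be available")
    mean_logit = sum(logits) / len(logits)
    return 1.0 / (1.0 + math.exp(-mean_logit))


# Illustrative weights only; they are not taken from the paper.
example_paths = {
    "cde": LinearPath([0.5, -0.25], bias=0.0),
    "ct": LinearPath([1.0], bias=-1.0),
}

# Subject with both modalities, and the same subject with CT missing.
risk_both = fuse_predict(example_paths, {"cde": [2.0, 4.0], "ct": [3.0]})
risk_cde_only = fuse_predict(example_paths, {"cde": [2.0, 4.0], "ct": None})
print(risk_both, risk_cde_only)
```

Averaging only over the available paths is one simple way to let subjects with incomplete data take part in both training and prediction; other fusion rules (for example, a learned fusion layer with a missing-modality indicator) would fit the same description.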