Paper Title

A Generalizable Artificial Intelligence Model for COVID-19 Classification Task Using Chest X-ray Radiographs: Evaluated Over Four Clinical Datasets with 15,097 Patients

Authors

Ran Zhang, Xin Tie, John W. Garrett, Dalton Griner, Zhihua Qi, Nicholas B. Bevins, Scott B. Reeder, Guang-Hong Chen

Abstract

Purpose: To answer the long-standing question of whether a model trained on data from a single clinical site can be generalized to external sites. Materials and Methods: A total of 17,537 chest x-ray radiographs (CXRs) from 3,264 COVID-19-positive patients and 4,802 COVID-19-negative patients were collected from a single site for AI model development. The generalizability of the trained model was retrospectively evaluated using four different real-world clinical datasets with a total of 26,633 CXRs from 15,097 patients (3,277 COVID-19-positive patients). The area under the receiver operating characteristic curve (AUC) was used to assess diagnostic performance. Results: The AI model trained using a single-source clinical dataset achieved an AUC of 0.82 (95% CI: 0.80, 0.84) when applied to the internal temporal test set. When applied to datasets from two external clinical sites, AUCs of 0.81 (95% CI: 0.80, 0.82) and 0.82 (95% CI: 0.80, 0.84) were achieved. An AUC of 0.79 (95% CI: 0.77, 0.81) was achieved when applied to a multi-institutional COVID-19 dataset collected by the Medical Imaging and Data Resource Center (MIDRC). A power-law dependence, N^k (k is empirically found to be -0.21 to -0.25), indicates a relatively weak dependence of performance on training data size. Conclusion: A COVID-19 classification AI model trained using well-curated data from a single clinical site is generalizable to external clinical sites without a significant drop in performance.
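
The abstract reports each result as an AUC with a 95% confidence interval but does not say how the intervals were computed. As a rough illustration only, the sketch below computes an AUC from its rank interpretation (the probability that a positive case receives a higher score than a negative one) and attaches a percentile bootstrap interval. The percentile bootstrap is an assumption, not necessarily the authors' method. The function names `auc` and `bootstrap_ci` are hypothetical, and the scores are synthetic, not data from the study.

```python
import random


def auc(labels, scores):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: 40 "positive" and 60 "negative" cases with
# overlapping Gaussian scores. These are NOT the study's data.
_rng = random.Random(42)
demo_labels = [1] * 40 + [0] * 60
demo_scores = ([_rng.gauss(1.0, 1.0) for _ in range(40)]
               + [_rng.gauss(0.0, 1.0) for _ in range(60)])

point = auc(demo_labels, demo_scores)
lower, upper = bootstrap_ci(demo_labels, demo_scores)
print(f"AUC = {point:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

The pairwise comparison in `auc` is quadratic in the number of cases, which is fine for a small illustration. For datasets the size of those in the abstract, a rank-based implementation would be the more practical choice.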
