Paper title
CIRCA: comprehensible online system in support of chest X-rays-based COVID-19 diagnosis
Paper authors
Paper abstract
Due to the large accumulation of patients requiring hospitalization, the COVID-19 pandemic caused a severe overload of health systems, even in developed countries. Deep learning techniques based on medical imaging data can help detect COVID-19 cases faster and monitor disease progression. Despite the numerous solutions proposed for chest X-rays (CXRs), none of them is a product that can be used in the clinic. Five different datasets (POLCOVID, AIforCOVID, COVIDx, NIH, and artificially generated data) were used to construct a representative dataset of 23 799 CXRs for model training; 1 050 images were used as a hold-out test set, and 44 247 as an independent test set (BIMCV database). A U-Net-based model was developed to identify the clinically relevant region of the CXR. Each image class (normal, pneumonia, and COVID-19) was divided into 3 subtypes using a 2D Gaussian mixture model. A decision tree was used to aggregate predictions from an InceptionV3 network based on processed CXRs and from a dense neural network based on radiomic features. The lung segmentation model achieved a Sørensen-Dice coefficient of 94.86% on the validation dataset and 93.36% on the testing dataset. In 5-fold cross-validation, the accuracy for all classes ranged from 91% to 93%, with specificity slightly higher than sensitivity and NPV slightly higher than PPV. In the hold-out test set, the balanced accuracy ranged between 68% and 100%. The highest performance was obtained for the subtypes N1, P1, and C1. Similar performance was obtained on the independent dataset for the normal and COVID-19 class subtypes. Seventy-six percent of COVID-19 patients wrongly classified as normal cases had been annotated by radiologists as showing no signs of disease. Finally, we developed an online service (https://circa.aei.polsl.pl) to provide access to fast diagnosis support tools.
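For readers less familiar with the evaluation measures named in the abstract, the following is a minimal sketch of how the Sørensen-Dice coefficient and the sensitivity, specificity, PPV, NPV, and balanced accuracy are conventionally computed. It is not the authors' code: the function names `dice_coefficient` and `binary_metrics` are hypothetical, the masks are assumed to be nested lists of 0/1 values, and any inputs used with it are toy values rather than results from the paper.

```python
def dice_coefficient(pred_mask, true_mask):
    """Sørensen-Dice coefficient between two binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred_mask for p in row]
    flat_true = [t for row in true_mask for t in row]
    # Pixels marked as lung in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(flat_pred, flat_true) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_true if t)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def binary_metrics(tp, fp, tn, fn):
    """One-vs-rest metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. the class has at least
    one positive and one negative example and prediction.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        # Balanced accuracy is the mean of sensitivity and specificity.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }
```

Because balanced accuracy averages sensitivity and specificity, it is less affected by class imbalance than plain accuracy, which matters when subtype sizes differ widely, as they can in a hold-out set.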