SPCXR: Self-supervised Pretraining using Chest X-rays Towards a Domain Specific Foundation Model

Paper title

SPCXR: Self-supervised Pretraining using Chest X-rays Towards a Domain Specific Foundation Model

Authors

Anwar, Syed Muhammad; Parida, Abhijeet; Atito, Sara; Awais, Muhammad; Nino, Gustavo; Kittler, Josef; Linguraru, Marius George

Abstract

Chest X-rays (CXRs) are a widely used imaging modality for the diagnosis and prognosis of lung disease. The image analysis tasks vary; examples include pathology detection and lung segmentation. There is a large body of work in which machine learning algorithms are developed for specific tasks. A significant recent example is coronavirus disease (COVID-19) detection using CXR data. However, traditional diagnostic tool design methods based on supervised learning are burdened by the need to provide training data annotations, which should be of good quality for better clinical outcomes. Here, we propose an alternative solution, a new self-supervised paradigm, in which a general representation from CXRs is learned using a group-masked self-supervised framework. The pre-trained model is then fine-tuned for domain-specific tasks such as COVID-19 detection, pneumonia detection, and general health screening. We show that the same pre-training can be used for the lung segmentation task. Our proposed paradigm shows robust performance in multiple downstream tasks, which demonstrates the success of the pre-training. Moreover, the performance of the pre-trained models on data with significant drift during test time proves the learning of a better generic representation. The methods are further validated by COVID-19 detection in a unique small-scale pediatric data set. The performance gain in accuracy (~25%) is significant when compared to a supervised transformer-based method. This adds credence to the strength and reliability of our proposed framework and pre-training strategy.
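
The abstract names a group-masked self-supervised framework but gives no implementation details. The following is only a minimal sketch of the general idea of masking contiguous groups of patches and scoring a reconstruction on the masked region. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`group_mask`, `corrupt`, `masked_l1`), the rectangular block shape, the 50% mask ratio, zero-filling of masked patches, and the L1 reconstruction error.

```python
import random


def group_mask(grid_h, grid_w, mask_ratio=0.5, max_block=4, seed=None):
    """Return a grid_h x grid_w boolean grid; True marks a masked patch.

    Patches are masked in random rectangular blocks (groups) rather than
    one at a time, until at least mask_ratio of the grid is covered.
    The block sizes and the ratio are illustrative assumptions.
    """
    rng = random.Random(seed)
    mask = [[False] * grid_w for _ in range(grid_h)]
    target = round(mask_ratio * grid_h * grid_w)
    masked = 0
    while masked < target:
        block_h = rng.randint(1, min(max_block, grid_h))
        block_w = rng.randint(1, min(max_block, grid_w))
        top = rng.randint(0, grid_h - block_h)
        left = rng.randint(0, grid_w - block_w)
        for r in range(top, top + block_h):
            for c in range(left, left + block_w):
                if not mask[r][c]:
                    mask[r][c] = True
                    masked += 1
    return mask


def corrupt(image, mask, patch):
    """Return a copy of a 2-D image (list of rows) with masked patches zeroed.

    Zero-filling is an assumption; other corruptions (e.g. noise) are possible.
    """
    return [
        [0.0 if mask[r // patch][c // patch] else value
         for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]


def masked_l1(reconstruction, image, mask, patch):
    """Mean absolute error computed over the masked pixels only."""
    errors = [
        abs(reconstruction[r][c] - image[r][c])
        for r in range(len(image))
        for c in range(len(image[0]))
        if mask[r // patch][c // patch]
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Toy 8x8 "image" split into a 4x4 grid of 2x2 patches.
    image = [[1.0] * 8 for _ in range(8)]
    mask = group_mask(4, 4, mask_ratio=0.5, max_block=2, seed=0)
    corrupted = corrupt(image, mask, patch=2)
    # In pre-training, a model would take `corrupted` as input and be
    # trained to lower this error; here the corrupted input is scored directly.
    print(masked_l1(corrupted, image, mask, patch=2))
```

In a real pipeline the image would be a tensor and the reconstruction would come from a trained encoder; the pure-Python lists are used here only to keep the sketch dependency-free.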
