Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis

Title

Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis

Authors

Qi, Xiao; Foran, David J.; Nosher, John L.; Hacihaliloglu, Ilker

Abstract

The role of chest X-ray (CXR) imaging has evolved during the COVID-19 pandemic because it is more cost-effective, more widely available, and faster to acquire than CT. To improve the diagnostic performance of CXR imaging, a growing number of studies have investigated whether supervised deep learning methods can provide additional support. However, supervised methods rely on a large number of labeled radiology images, and labeling is a time-consuming and complex procedure requiring expert clinician input. Due to the relative scarcity of COVID-19 patient data and the costly labeling process, self-supervised learning methods have gained momentum and have been reported to achieve results comparable to fully supervised learning approaches. In this work, we study the effectiveness of self-supervised learning in the context of diagnosing COVID-19 disease from CXR images. We propose a multi-feature Vision Transformer (ViT) guided architecture in which we deploy a cross-attention mechanism to learn information from both original CXR images and corresponding enhanced local phase CXR images. We demonstrate that the performance of the baseline self-supervised learning models can be further improved by leveraging the local phase-based enhanced CXR images. Using 10% labeled CXR scans, the proposed model achieves 91.10% and 96.21% overall accuracy when tested on a total of 35,483 CXR images of healthy (8,851), regular pneumonia (6,045), and COVID-19 (18,159) scans, and shows significant improvement over state-of-the-art techniques. Code is available at https://github.com/endiqq/Multi-Feature-ViT
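
The abstract does not spell out how the cross-attention step is wired, so the following is only a minimal sketch of generic single-head scaled dot-product cross-attention, in which tokens from one branch (for example, the original CXR image) attend to tokens from the other branch (the enhanced image). The function names, the toy token values, and the omission of learned projection matrices are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: token vectors from one branch (e.g. the original image)
    keys, values: token vectors from the other branch (e.g. the enhanced image)
    Learned query/key/value projections are omitted to keep the sketch short.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


# Hypothetical toy embeddings: 2 tokens per branch, dimension 2.
original_tokens = [[1.0, 0.0], [0.0, 1.0]]
enhanced_tokens = [[0.5, 0.5], [1.0, -1.0]]

fused = cross_attention(original_tokens, enhanced_tokens, enhanced_tokens)
```

Each row of `fused` is a weighted average of the enhanced-branch tokens, with weights determined by how strongly the corresponding original-branch token matches each of them. A full model would typically add learned projections, multiple heads, and residual connections around this step.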
