Paper Title
Unsupervised Contrastive Learning based Transformer for Lung Nodule Detection
Authors
Abstract
Early detection of lung nodules with computed tomography (CT) is critical for longer survival and better quality of life in lung cancer patients. Computer-aided detection/diagnosis (CAD) has proven valuable as a second or concurrent reader in this context. However, accurate detection of lung nodules remains a challenge for such CAD systems, and even for radiologists, due not only to the variability in size, location, and appearance of lung nodules but also to the complexity of lung structures. This leads to a high false-positive rate with CAD, compromising its clinical efficacy. Motivated by recent computer vision techniques, we present a self-supervised region-based 3D transformer model to identify lung nodules among a set of candidate regions. Specifically, we develop a 3D vision transformer (ViT) that divides a CT image volume into a sequence of non-overlapping cubes, extracts embedding features from each cube with an embedding layer, and analyzes all embedding features with a self-attention mechanism to make the prediction. To train the transformer model effectively on a relatively small dataset, a region-based contrastive learning method is used to boost performance by pre-training the 3D transformer on public CT images. Our experiments show that the proposed method can significantly improve the performance of lung nodule screening compared with commonly used 3D convolutional neural networks.
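The tokenization and attention steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the volume size (32 x 32 x 32), cube size (8), embedding dimension (16), random weights, and the helper names `split_into_cubes` and `self_attention` are all assumptions chosen for illustration, and a single attention head with a plain linear embedding stands in for the full 3D ViT.

```python
import numpy as np


def split_into_cubes(volume, cube):
    """Split a (D, H, W) volume into non-overlapping cube^3 blocks.

    Returns an array of shape (num_cubes, cube**3), one flattened cube per row.
    """
    d, h, w = volume.shape
    if d % cube or h % cube or w % cube:
        raise ValueError("each volume dimension must be divisible by the cube size")
    gd, gh, gw = d // cube, h // cube, w // cube
    # Expose the grid index and the within-cube index along each axis.
    blocks = volume.reshape(gd, cube, gh, cube, gw, cube)
    # Bring the three grid axes to the front, then flatten each cube.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(gd * gh * gw, cube ** 3)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
volume = rng.standard_normal((32, 32, 32))  # hypothetical stand-in for a CT candidate region
cube, dim = 8, 16                           # assumed cube size and embedding width

tokens_raw = split_into_cubes(volume, cube)            # 64 cubes of 512 voxels each
w_embed = rng.standard_normal((cube ** 3, dim)) * 0.02  # random linear embedding (assumption)
tokens = tokens_raw @ w_embed                           # one embedding vector per cube

w_q, w_k, w_v = (rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))
attended, weights = self_attention(tokens, w_q, w_k, w_v)

print(tokens_raw.shape, tokens.shape, attended.shape)
# (64, 512) (64, 16) (64, 16)
```

In a trained model the embedding and attention weights would be learned, and several attention layers followed by a classification head would produce the nodule prediction. The sketch only shows how a volume becomes a sequence of cube tokens that attend to one another.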