Paper Title
Weakly Supervised Vessel Segmentation in X-ray Angiograms by Self-Paced Learning from Noisy Labels with Suggestive Annotation
Paper Authors
Paper Abstract
The segmentation of coronary arteries in X-ray angiograms by convolutional neural networks (CNNs) is promising yet limited by the requirement of precisely annotating all pixels in a large number of training images, which is extremely labor-intensive, especially for complex coronary trees. To alleviate the burden on annotators, we propose a novel weakly supervised training framework that learns from noisy pseudo labels generated by automatic vessel enhancement, rather than from accurate labels obtained by fully manual annotation. A typical self-paced learning scheme is adopted to make the training process robust against label noise, but it is challenged by the systematic biases in the pseudo labels, which degrade the performance of CNNs at test time. To solve this problem, we propose an annotation-refining self-paced learning framework (AR-SPL) that corrects the potential errors using suggestive annotation. An elaborate model-vesselness uncertainty estimation is also proposed to minimize the cost of suggestive annotation, based not only on the CNNs in training but also on the geometric features of coronary arteries derived directly from the raw data. Experiments show that our proposed framework achieves 1) accuracy comparable to fully supervised learning, significantly outperforming other weakly supervised learning frameworks; 2) a largely reduced annotation cost, i.e., 75.18% of annotation time is saved and only 3.46% of image regions need to be annotated; and 3) an efficient intervention process, yielding superior performance with even fewer manual interactions.
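To make the ideas in the abstract concrete, below is a minimal sketch of two core ingredients: self-paced learning from vesselness-based pseudo labels, and a fused model-vesselness uncertainty map for selecting regions to suggest for annotation. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the Frangi filter stands in for the unspecified vessel-enhancement step, softmax entropy stands in for the model-uncertainty term, and VESSEL_THRESH, pace_threshold, and the entropy-times-ambiguity fusion rule are all hypothetical.

```python
# Sketch only, NOT the paper's implementation. Assumptions:
# - pseudo labels come from a Frangi vesselness filter (skimage.filters.frangi);
# - model uncertainty is the softmax entropy of the CNN in training;
# - classic hard-weighted self-paced learning keeps only low-loss pixels.
import numpy as np
import torch
import torch.nn.functional as F
from skimage.filters import frangi

VESSEL_THRESH = 0.05  # hypothetical binarization threshold for pseudo labels

def make_pseudo_labels(gray_image: np.ndarray) -> np.ndarray:
    """Noisy pseudo labels from automatic vessel enhancement."""
    vesselness = frangi(gray_image)  # tubular-structure response, roughly in [0, 1]
    return (vesselness > VESSEL_THRESH).astype(np.int64)

def self_paced_loss(logits: torch.Tensor,
                    pseudo_labels: torch.Tensor,
                    pace_threshold: float) -> torch.Tensor:
    """Pixel-wise cross-entropy that keeps only 'easy' pixels whose loss
    falls below the current pace threshold (hard self-paced weighting)."""
    per_pixel = F.cross_entropy(logits, pseudo_labels, reduction="none")
    weights = (per_pixel.detach() < pace_threshold).float()  # v in {0, 1}
    return (weights * per_pixel).sum() / weights.sum().clamp(min=1.0)

def model_vesselness_uncertainty(logits: torch.Tensor,
                                 vesselness: torch.Tensor) -> torch.Tensor:
    """Fuse model uncertainty (softmax entropy) with a geometric prior:
    pixels where the CNN is uncertain AND the vesselness response is
    ambiguous are the strongest candidates for suggestive annotation."""
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp(min=1e-8).log()).sum(dim=1)  # model term
    ambiguity = 1.0 - (2.0 * vesselness - 1.0).abs()             # geometric term
    return entropy * ambiguity  # hypothetical fusion by elementwise product
```

A plausible AR-SPL loop, as described in the abstract, would gradually relax the pace threshold so harder pixels enter training, route the regions ranked highest by the fused uncertainty to the annotator, and let the corrected labels overwrite the pseudo labels in subsequent epochs; the exact scheduling and fusion rules above are placeholders.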