Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation

Paper Title

Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation

Authors

Ange Lou, Kareem Tawfik, Xing Yao, Ziteng Liu, Jack Noble

Abstract

A common problem when segmenting medical images with neural networks is the difficulty of obtaining a large amount of pixel-level annotated data for training. To address this issue, we propose a semi-supervised segmentation network based on contrastive learning. In contrast to the previous state of the art, we introduce Min-Max Similarity (MMS), a contrastive learning form of dual-view training that employs classifiers and projectors to build all-negative feature pairs and positive and negative feature pairs, respectively, so that learning is formulated as solving an MMS problem. The all-negative pairs are used to supervise the networks as they learn from different views and to capture general features, while the consistency of predictions on unlabeled data is measured by a pixel-wise contrastive loss between positive and negative pairs. To evaluate the proposed method quantitatively and qualitatively, we test it on four public endoscopy surgical tool segmentation datasets and on one cochlear implant surgery dataset that we manually annotated. Results indicate that the proposed method consistently outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms. Our semi-supervised segmentation algorithm can also successfully recognize unknown surgical tools and provide good predictions. In addition, the MMS approach achieves inference speeds of about 40 frames per second (fps), which makes it suitable for real-time video segmentation.
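
To make the idea of a pixel-wise contrastive loss between positive and negative pairs more concrete, the following is a minimal sketch of a generic InfoNCE-style loss over per-pixel feature vectors from two views. It is not the formulation used in the paper, which the abstract does not specify. The function name `pixel_contrastive_loss`, the `temperature` parameter, and the choice of treating the same pixel location across views as the positive pair (with all other pixels as negatives) are assumptions made purely for illustration.

```python
import math


def pixel_contrastive_loss(view_a, view_b, temperature=0.1):
    """Generic InfoNCE-style loss over per-pixel features (illustrative only).

    view_a, view_b: lists of equal length; entry i is the feature vector
    (a list of floats) of pixel i under each view. Assumption: pixel i in
    view_a and pixel i in view_b form the positive pair, and every other
    pixel in view_b acts as a negative.
    """
    def normalize(vec):
        # Scale to unit length so dot products are cosine similarities.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]

    total = 0.0
    for i, a_i in enumerate(a):
        # Similarity of pixel i (view A) to every pixel in view B.
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / len(a)
```

Under these assumptions, the loss is small when the two views agree pixel by pixel and grows when matching pixels have dissimilar features, which is the sense in which such a loss can measure prediction consistency on unlabeled images.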
