Paper Title

When CNN Meet with ViT: Towards Semi-Supervised Learning for Multi-Class Medical Image Semantic Segmentation

Authors

Ziyang Wang, Tianze Li, Jian-Qing Zheng, Baoru Huang

Abstract

Due to the lack of quality annotations in the medical imaging community, semi-supervised learning methods are highly valued in image semantic segmentation tasks. In this paper, an advanced consistency-aware pseudo-label-based self-ensembling approach is presented to fully utilize the power of the Vision Transformer (ViT) and the Convolutional Neural Network (CNN) in semi-supervised learning. Our proposed framework consists of a feature-learning module, in which ViT and CNN mutually enhance each other, and a guidance module that is robust for consistency-aware purposes. The pseudo labels are inferred and utilized recurrently and separately by the CNN and ViT views in the feature-learning module to expand the dataset, and the two views benefit each other. Meanwhile, a perturbation scheme is designed for the feature-learning module, and network-weight averaging is utilized to develop the guidance module. By doing so, the framework combines the feature-learning strengths of CNN and ViT, strengthens performance via dual-view co-training, and enables consistency-aware supervision in a semi-supervised manner. A topological exploration of all alternative supervision modes with CNN and ViT is validated in detail, demonstrating the most promising performance and the specific settings of our method on semi-supervised medical image segmentation tasks. Experimental results show that the proposed method achieves state-of-the-art performance on a public benchmark dataset under a variety of metrics. The code is publicly available.
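The abstract describes two mechanisms: each view (CNN and ViT) producing pseudo labels that supervise the other view, and a guidance module built by averaging network weights (a mean-teacher-style exponential moving average). The following is a minimal NumPy sketch of both ideas under stated assumptions; the function names and toy logits are illustrative and are not taken from the paper's released code.

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """Guidance-module weight update (sketch): the teacher's weights
    track an exponential moving average of the student's weights."""
    return alpha * teacher_w + (1.0 - alpha) * student_w

def cross_pseudo_labels(logits_cnn, logits_vit):
    """Dual-view co-training (sketch): each view's hard pseudo labels
    are used to supervise the *other* view on unlabeled images."""
    pl_for_vit = logits_cnn.argmax(axis=-1)  # CNN predictions teach ViT
    pl_for_cnn = logits_vit.argmax(axis=-1)  # ViT predictions teach CNN
    return pl_for_cnn, pl_for_vit

# Toy per-pixel logits for 3 pixels and 2 classes from each view.
logits_cnn = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
logits_vit = np.array([[0.7, 0.3], [0.1, 0.9], [0.3, 0.7]])
pl_for_cnn, pl_for_vit = cross_pseudo_labels(logits_cnn, logits_vit)

# One EMA step of the guidance module on toy flattened weights.
teacher_w = ema_update(np.ones(4), np.zeros(4), alpha=0.9)
```

In a full training loop, the pseudo labels would enter each view's segmentation loss on unlabeled data, while the EMA teacher would provide the consistency-aware supervision signal under input perturbations.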
