TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model

Paper Title

TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model

Authors

Reza Azad, Mohammad T. AL-Antary, Moein Heidari, Dorit Merhof

Abstract

In the past few years, convolutional neural networks (CNNs), particularly U-Net, have been the prevailing technique in medical image processing. Specifically, the seminal U-Net and its variants have successfully addressed a wide variety of medical image segmentation tasks. However, these architectures are intrinsically limited: they fail to capture long-range interactions and spatial dependencies, which leads to a severe performance drop when segmenting medical images with variable shapes and structures. Transformers, originally proposed for sequence-to-sequence prediction, have emerged as alternative architectures that model global information precisely through the self-attention mechanism. Despite this appeal, using a pure Transformer for image segmentation can result in limited localization capacity stemming from inadequate low-level features. Thus, a line of research strives to design robust variants of Transformer-based U-Net. In this paper, we propose TransNorm, a novel deep segmentation framework that integrates a Transformer module into both the encoder and the skip connections of the standard U-Net. We argue that a well-designed skip connection can be crucial for accurate segmentation, as it assists feature fusion between the expanding and contracting paths. In this respect, we derive a Spatial Normalization mechanism from the Transformer module to adaptively recalibrate the skip connection path. Extensive experiments across three typical medical image segmentation tasks demonstrate the effectiveness of TransNorm. The code and trained models are publicly available at https://github.com/rezazad68/transnorm.
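
To make the idea of recalibrating a skip connection with attention-derived spatial weights more concrete, the following is a minimal NumPy sketch. It is not the TransNorm implementation: the function names (`self_attention`, `recalibrate_skip`), the single-head attention, the sigmoid gate, and the randomly initialized weight matrices are all assumptions chosen for illustration. It only shows the general pattern of treating each pixel of a skip feature map as a token, running self-attention over those tokens, and using the result to rescale the feature map per spatial location.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention over (N, C) tokens.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v  # (N, d)

def recalibrate_skip(skip, w_q, w_k, w_v, w_gate):
    # Hypothetical gate (an assumption, not the paper's exact mechanism):
    # rescale a (C, H, W) skip feature map with a per-pixel weight in (0, 1)
    # computed from self-attention over the pixels.
    c, h, w = skip.shape
    tokens = skip.reshape(c, h * w).T                   # (H*W, C), one token per pixel
    attended = self_attention(tokens, w_q, w_k, w_v)    # (H*W, d)
    gate = 1.0 / (1.0 + np.exp(-(attended @ w_gate)))   # (H*W, 1), sigmoid
    gate = gate.T.reshape(1, h, w)                      # broadcast over channels
    return skip * gate, gate

# Toy usage with random features and weights (all sizes are arbitrary).
rng = np.random.default_rng(0)
c, h, w, d = 8, 4, 4, 16
skip = rng.standard_normal((c, h, w))
w_q, w_k, w_v = (rng.standard_normal((c, d)) * 0.1 for _ in range(3))
w_gate = rng.standard_normal((d, 1)) * 0.1
out, gate = recalibrate_skip(skip, w_q, w_k, w_v, w_gate)
```

In this sketch the gate has one value per spatial position and is shared across channels, so the recalibrated skip features keep the shape of the input while locations the attention deems less relevant are attenuated before fusion with the decoder path.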
