Paper title
HUMUS-Net: Hybrid unrolled multi-scale network architecture for accelerated MRI reconstruction
Paper authors
Paper abstract
In accelerated MRI reconstruction, the anatomy of a patient is recovered from a set of under-sampled and noisy measurements. Deep learning approaches have proven successful in solving this ill-posed inverse problem and are capable of producing very high quality reconstructions. However, current architectures rely heavily on convolutions, which are content-independent and have difficulty modeling long-range dependencies in images. Recently, Transformers, the workhorse of contemporary natural language processing, have emerged as powerful building blocks for a multitude of vision tasks. These models split input images into non-overlapping patches, embed the patches into lower-dimensional tokens, and utilize a self-attention mechanism that does not suffer from the aforementioned weaknesses of convolutional architectures. However, Transformers incur extremely high compute and memory cost when 1) the input image resolution is high and 2) the image needs to be split into a large number of patches to preserve fine detail, both of which are typical of low-level vision problems such as MRI reconstruction and have a compounding effect. To tackle these challenges, we propose HUMUS-Net, a hybrid architecture that combines the beneficial implicit bias and efficiency of convolutions with the power of Transformer blocks in an unrolled, multi-scale network. HUMUS-Net extracts high-resolution features via convolutional blocks and refines low-resolution features via a novel Transformer-based multi-scale feature extractor. Features from both levels are then synthesized into a high-resolution output reconstruction. Our network establishes a new state of the art on the largest publicly available MRI dataset, the fastMRI dataset. We further demonstrate the performance of HUMUS-Net on two other popular MRI datasets and perform fine-grained ablation studies to validate our design.
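The abstract combines two ideas that can be illustrated concretely: an unrolled reconstruction loop that alternates image-space refinement with consistency against the under-sampled k-space measurements (a common pattern for unrolled MRI networks), and a hybrid denoiser that keeps convolutions at full resolution while applying Transformer-style self-attention only to a downsampled feature map, so the number of tokens stays small. The PyTorch sketch below is an illustrative toy under these assumptions, not the authors' HUMUS-Net implementation: it assumes single-coil data and a random sampling mask, the class and function names (TokenSelfAttention, HybridDenoiser, unrolled_recon) are hypothetical, and the network is untrained, so the example only checks shapes and data flow.

# Illustrative sketch (not the authors' implementation) of an unrolled hybrid
# reconstruction: conv features at high resolution, self-attention at low resolution,
# and a data-consistency step against the measured k-space samples.
import torch
import torch.nn as nn


class TokenSelfAttention(nn.Module):
    """Patch-embed a low-resolution feature map into tokens and apply self-attention."""

    def __init__(self, channels: int, patch: int = 2, heads: int = 4):
        super().__init__()
        self.embed = nn.Conv2d(channels, channels, kernel_size=patch, stride=patch)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.unembed = nn.ConvTranspose2d(channels, channels, kernel_size=patch, stride=patch)

    def forward(self, x):
        tokens = self.embed(x)                           # (B, C, H/p, W/p)
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)          # (B, num_tokens, C)
        seq, _ = self.attn(seq, seq, seq)                # global self-attention over tokens
        tokens = seq.transpose(1, 2).reshape(b, c, h, w)
        return x + self.unembed(tokens)                  # residual refinement


class HybridDenoiser(nn.Module):
    """Convolutions at full resolution, attention only at half resolution."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.head = nn.Conv2d(2, channels, 3, padding=1)         # 2 = real/imag channels
        self.hi_res = nn.Sequential(                             # convolutional high-res path
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.down = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.lo_res = TokenSelfAttention(channels)                # attention on fewer tokens
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.tail = nn.Conv2d(channels, 2, 3, padding=1)          # synthesize high-res output

    def forward(self, x):
        f = self.head(x)
        hi = self.hi_res(f)
        lo = self.up(self.lo_res(self.down(f)))
        return x + self.tail(hi + lo)                             # residual image update


def unrolled_recon(kspace, mask, iters: int = 4):
    """Unrolled loop: denoise in image space, then restore the acquired k-space samples."""
    denoiser = HybridDenoiser()
    image = torch.fft.ifft2(kspace)                               # zero-filled initial estimate
    for _ in range(iters):
        x = torch.stack([image.real, image.imag], dim=1)          # (B, 2, H, W)
        x = denoiser(x)
        image = torch.complex(x[:, 0], x[:, 1])
        k = torch.fft.fft2(image)
        k = torch.where(mask, kspace, k)                          # data consistency
        image = torch.fft.ifft2(k)
    return image.abs()


# Toy usage: a single-coil 128x128 example with a random 25% sampling mask.
if __name__ == "__main__":
    torch.manual_seed(0)
    mask = torch.rand(1, 128, 128) < 0.25
    kspace = torch.fft.fft2(torch.randn(1, 128, 128, dtype=torch.complex64)) * mask
    print(unrolled_recon(kspace, mask).shape)                     # torch.Size([1, 128, 128])

Restricting self-attention to the downsampled feature map is one simple way to keep the token count, and hence the quadratic attention cost the abstract identifies as the bottleneck, manageable at high input resolutions; the actual HUMUS-Net goes further, using a multi-scale Transformer-based extractor inside an unrolled network as described above.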