Paper Title


One Model to Synthesize Them All: Multi-contrast Multi-scale Transformer for Missing Data Imputation

Authors

Jiang Liu, Srivathsa Pasumarthi, Ben Duffy, Enhao Gong, Keshav Datta, Greg Zaharchuk

Abstract


Multi-contrast magnetic resonance imaging (MRI) is widely used in clinical practice as each contrast provides complementary information. However, the availability of each imaging contrast may vary amongst patients, which poses challenges to radiologists and automated image analysis algorithms. A general approach for tackling this problem is missing data imputation, which aims to synthesize the missing contrasts from existing ones. While several convolutional neural network (CNN)-based algorithms have been proposed, they suffer from the fundamental limitations of CNN models, such as the requirement for fixed numbers of input and output channels, the inability to capture long-range dependencies, and the lack of interpretability. In this work, we formulate missing data imputation as a sequence-to-sequence learning problem and propose a multi-contrast multi-scale Transformer (MMT), which can take any subset of input contrasts and synthesize those that are missing. MMT consists of a multi-scale Transformer encoder that builds hierarchical representations of inputs combined with a multi-scale Transformer decoder that generates the outputs in a coarse-to-fine fashion. The proposed multi-contrast Swin Transformer blocks can efficiently capture intra- and inter-contrast dependencies for accurate image synthesis. Moreover, MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions by analyzing the in-built attention maps of Transformer blocks in the decoder. Extensive experiments on two large-scale multi-contrast MRI datasets demonstrate that MMT outperforms the state-of-the-art methods quantitatively and qualitatively.
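To make the sequence-to-sequence framing concrete, below is a minimal PyTorch sketch, not the authors' code: each available contrast is split into patch tokens tagged with a learned contrast embedding, a Transformer encoder builds a shared representation, and per-contrast decoder queries synthesize each missing contrast. The class name ContrastSeq2Seq and all hyperparameters are hypothetical, and the real MMT uses multi-contrast Swin Transformer blocks with a multi-scale, coarse-to-fine decoder rather than the vanilla single-scale layers shown here.

```python
import torch
import torch.nn as nn

class ContrastSeq2Seq(nn.Module):
    """Toy seq2seq imputation over per-contrast patch tokens (illustrative only).

    Available contrasts -> patch embeddings + contrast embedding -> encoder;
    decoder queries (one set per missing contrast) attend to the encoded
    tokens and are projected back to image patches.
    """
    def __init__(self, n_contrasts=4, img=64, patch=8, dim=128):
        super().__init__()
        self.patch = patch
        self.n_tokens = (img // patch) ** 2
        self.embed = nn.Linear(patch * patch, dim)          # patch -> token
        self.contrast_emb = nn.Embedding(n_contrasts, dim)  # which contrast
        self.pos_emb = nn.Parameter(torch.zeros(self.n_tokens, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), 2)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(dim, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(dim, patch * patch)           # token -> patch

    def tokens(self, x):
        # x: (B, 1, H, W) -> (B, n_tokens, patch*patch)
        b = x.shape[0]
        p = x.unfold(2, self.patch, self.patch).unfold(3, self.patch, self.patch)
        return p.reshape(b, self.n_tokens, -1)

    def forward(self, inputs, missing_ids):
        # inputs: dict {contrast_id: (B, 1, H, W)} for the available subset
        toks = [self.embed(self.tokens(im)) + self.pos_emb
                + self.contrast_emb.weight[cid]
                for cid, im in inputs.items()]
        memory = self.encoder(torch.cat(toks, dim=1))
        b = memory.shape[0]
        outs = {}
        for cid in missing_ids:
            q = (self.pos_emb + self.contrast_emb.weight[cid]).expand(b, -1, -1)
            # (B, n_tokens, patch*patch); fold back into an image as needed
            outs[cid] = self.head(self.decoder(q, memory))
        return outs

model = ContrastSeq2Seq()
avail = {0: torch.randn(2, 1, 64, 64), 2: torch.randn(2, 1, 64, 64)}
out = model(avail, missing_ids=[1, 3])   # synthesize contrasts 1 and 3
```

Because the contrast identity is carried by an embedding rather than a fixed channel index, the same model handles any subset of inputs and outputs, which is the property the abstract highlights over channel-fixed CNNs; the decoder's cross-attention weights are also what one would inspect to see which input contrasts drive each synthesized region.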
