Paper Title
Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation
Paper Authors
Paper Abstract
Masked Autoencoder (MAE) has recently been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis. By reconstructing full images from partially masked inputs, a ViT encoder aggregates contextual information to infer masked image regions. We believe that this context aggregation ability is particularly essential in the medical image domain, where each anatomical structure is functionally and mechanically connected to other structures and regions. Because there is no ImageNet-scale medical image dataset for pre-training, we investigate a self pre-training paradigm with MAE for medical image analysis tasks. Our method pre-trains a ViT on the training set of the target data instead of another dataset. Thus, self pre-training can benefit more scenarios where pre-training data is hard to acquire. Our experimental results show that MAE self pre-training markedly improves performance on diverse medical image tasks, including chest X-ray disease classification, abdominal CT multi-organ segmentation, and MRI brain tumor segmentation. Code is available at https://github.com/cvlab-stonybrook/SelfMedMAE.
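To make the paradigm concrete, below is a minimal PyTorch sketch of MAE-style self pre-training on single-channel images (e.g., chest X-rays): patches are randomly masked, only the visible patches are encoded, and a lightweight decoder reconstructs the masked pixels. The toy model sizes, masking logic, and class/variable names here are illustrative assumptions, not the authors' SelfMedMAE implementation; see the linked repository for the actual code.

```python
# Illustrative sketch of MAE-style self pre-training (assumed, simplified; not SelfMedMAE).
import torch
import torch.nn as nn

class ToyMAE(nn.Module):
    def __init__(self, img_size=224, patch_size=16, dim=128, mask_ratio=0.75):
        super().__init__()
        self.patch_size = patch_size
        self.mask_ratio = mask_ratio
        self.num_patches = (img_size // patch_size) ** 2
        patch_dim = patch_size * patch_size  # single-channel patches
        self.patch_embed = nn.Linear(patch_dim, dim)
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=1,
        )
        self.head = nn.Linear(dim, patch_dim)  # predict raw pixels per patch

    def patchify(self, imgs):
        # (B, 1, H, W) -> (B, N, patch_size * patch_size)
        p = self.patch_size
        B, C, H, W = imgs.shape
        x = imgs.reshape(B, C, H // p, p, W // p, p)
        return x.permute(0, 2, 4, 3, 5, 1).reshape(B, -1, p * p * C)

    def forward(self, imgs):
        patches = self.patchify(imgs)
        tokens = self.patch_embed(patches) + self.pos_embed
        B, N, D = tokens.shape

        # Randomly keep a subset of patches; the rest are masked out.
        num_keep = int(N * (1 - self.mask_ratio))
        ids_keep = torch.rand(B, N, device=imgs.device).argsort(dim=1)[:, :num_keep]
        visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))

        # Encode only the visible patches (MAE's efficiency trick).
        encoded = self.encoder(visible)

        # Re-insert mask tokens at masked positions, then decode and predict pixels.
        full = self.mask_token.expand(B, N, D).clone()
        full.scatter_(1, ids_keep.unsqueeze(-1).expand(-1, -1, D), encoded)
        pred = self.head(self.decoder(full + self.pos_embed))

        # Reconstruction loss computed on the masked patches only.
        mask = torch.ones(B, N, device=imgs.device)
        mask.scatter_(1, ids_keep, 0.0)
        return (((pred - patches) ** 2).mean(dim=-1) * mask).sum() / mask.sum()


# "Self pre-training": run this reconstruction objective on the *target* training set
# (e.g., the chest X-rays to be classified later), then reuse the encoder weights
# to initialize the downstream classification or segmentation ViT.
model = ToyMAE()
x = torch.randn(2, 1, 224, 224)  # stand-in for a batch of medical images
loss = model(x)
loss.backward()
```

A usage note under this sketch's assumptions: after pre-training converges, one would discard the decoder and mask token, load the encoder (and patch/positional embeddings) into the downstream ViT, and fine-tune with a task-specific head on the same target dataset.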