Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input

Paper title

Introducing Vision Transformer for Alzheimer's Disease classification task with 3D input

Authors

Zilun Zhang, Farzad Khalvati

Abstract

Many high-performance classification models utilize complex CNN-based architectures for Alzheimer's Disease classification. We aim to investigate two relevant questions regarding classification of Alzheimer's Disease using MRI: "Do Vision Transformer-based models perform better than CNN-based models?" and "Is it possible to use a shallow 3D CNN-based model to obtain satisfactory results?" To answer these questions, we propose two models that can take in and process 3D MRI scans: the Convolutional Voxel Vision Transformer (CVVT) architecture, and ConvNet3D-4, a shallow 4-block 3D CNN-based model. Our results indicate that shallow 3D CNN-based models are sufficient to achieve good classification results for Alzheimer's Disease using MRI scans.
