Paper Title
Multimodal Attention-based Deep Learning for Alzheimer's Disease Diagnosis
Paper Authors
Abstract
Alzheimer's Disease (AD) is the most common neurodegenerative disorder and has one of the most complex pathogeneses, making effective and clinically actionable decision support difficult. The objective of this study was to develop a novel multimodal deep learning framework to aid medical professionals in AD diagnosis. We present a Multimodal Alzheimer's Disease Diagnosis framework (MADDi) to accurately detect the presence of AD and mild cognitive impairment (MCI) from imaging, genetic, and clinical data. MADDi is novel in that we use cross-modal attention, which captures interactions between modalities, a method not previously explored in this domain. We perform multi-class classification, a challenging task given the strong similarities between MCI and AD. We compare with previous state-of-the-art models, evaluate the importance of attention, and examine the contribution of each modality to the model's performance. MADDi classifies MCI, AD, and controls with 96.88% accuracy on a held-out test set. When examining the contribution of different attention schemes, we found that the combination of cross-modal attention with self-attention performed best, while the model with no attention layers performed worst, a 7.9% difference in F1 score. Our experiments underlined the importance of structured clinical data in helping machine learning models contextualize and interpret the remaining modalities. Extensive ablation studies showed that any multimodal mixture of input features without access to structured clinical information suffered marked performance losses. This study demonstrates the merit of combining multiple input modalities via cross-modal attention to deliver highly accurate AD diagnostic decision support.
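The abstract does not include implementation details, but cross-modal attention of the kind described is typically scaled dot-product attention in which the queries come from one modality and the keys/values from another, so that one modality's features are re-weighted by their relevance to the other. The following is a minimal sketch under that assumption; the function and variable names are illustrative, not MADDi's actual code:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    """Scaled dot-product attention across two modalities.

    queries:     (n_q, d) feature vectors from one modality (e.g. imaging)
    keys_values: (n_kv, d) feature vectors from another (e.g. clinical)
    Returns (n_q, d): each query re-expressed as a weighted mix of the
    other modality's features.
    """
    d_k = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d_k)  # (n_q, n_kv) similarities
    weights = softmax(scores, axis=-1)               # each row sums to 1
    return weights @ keys_values

# Illustrative usage: imaging-derived features attend over clinical features.
rng = np.random.default_rng(0)
imaging_feats = rng.normal(size=(5, 8))
clinical_feats = rng.normal(size=(3, 8))
attended = cross_modal_attention(imaging_feats, clinical_feats)
```

In a self-attention layer, by contrast, `queries` and `keys_values` come from the same modality; the paper's best configuration combines both kinds of layer.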