Investigating the Predictive Reproducibility of Federated Graph Neural Networks using Medical Datasets

Paper Title

Investigating the Predictive Reproducibility of Federated Graph Neural Networks using Medical Datasets

Authors

Balik, Mehmet Yigit; Rekik, Arwa; Rekik, Islem

Abstract

Graph neural networks (GNNs) have achieved remarkable advances in various areas, including the fields of medical imaging and network neuroscience, where they have displayed high accuracy in diagnosing challenging neurological disorders such as autism. In the face of medical data scarcity and high privacy requirements, training such data-hungry models remains challenging. Federated learning offers an efficient solution to this issue by allowing models to be trained on multiple datasets, collected independently by different hospitals, in a fully data-preserving manner. Although both state-of-the-art GNNs and federated learning techniques focus on boosting classification accuracy, they overlook a critical unsolved problem: investigating the reproducibility of the most discriminative biomarkers (i.e., features) selected by the GNN models within a federated learning paradigm. Quantifying the reproducibility of a predictive medical model against perturbations of training and testing data distributions presents one of the biggest hurdles to overcome in developing translational clinical applications. To the best of our knowledge, this is the first work investigating the reproducibility of federated GNN models with application to classifying medical imaging and brain connectivity datasets. We evaluated our framework using various GNN models trained on medical imaging and connectomic datasets. More importantly, we showed that federated learning boosts both the accuracy and reproducibility of GNN models in such medical learning tasks. Our source code is available at https://github.com/basiralab/reproducibleFedGNN.
