Paper title
Learning brain MRI quality control: a multi-factorial generalization problem
Paper authors
Paper abstract
Given the growing volume of MRI data, automated quality control (QC) has become essential, especially for large-scale analyses. Several attempts have been made to develop reliable and scalable QC pipelines. However, generalizing these methods to new data, independent of the data used for training, remains difficult because of the biases inherent in MRI data. This work evaluated the performance of the MRIQC pipeline on several large-scale datasets (ABIDE, N = 1102, and CATI-derived datasets, N = 9037) used for both training and evaluation. We focused our analysis on the MRIQC preprocessing steps and tested the pipeline with and without them. We further analyzed the site-wise and study-wise distributions of predicted classification probabilities for the models without preprocessing trained on ABIDE and CATI data. Our main result was that a model using features extracted from MRIQC without preprocessing yielded the best results when trained and evaluated on large multi-center datasets with a heterogeneous population (an improvement of 0.10 in the ROC-AUC score on unseen data for the model trained on a subset of the CATI dataset). We concluded that a model trained on data from a heterogeneous population, such as the CATI dataset, provides the best scores on unseen data. Despite this performance improvement, the generalization abilities of the models remain questionable in light of the site-wise/study-wise probability predictions and the optimal classification thresholds derived from them.
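To make the closing caveat concrete, the following minimal sketch shows how a pooled ROC-AUC can look good while the optimal classification threshold still differs from site to site. It is an illustration only, not the authors' code: the abstract does not say how thresholds were derived, so the use of Youden's J statistic is an assumption, and the function names (`roc_auc`, `youden_threshold`, `per_site_thresholds`) and the toy scores are hypothetical.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(labels, scores):
    """Threshold t maximizing TPR - FPR, predicting positive when score >= t.

    Assumption: Youden's J is used here only as one common choice of
    "optimal" threshold; the source does not name a criterion.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        j = tp / n_pos - fp / n_neg
        if j > best_j:  # strict comparison keeps the lowest maximizing threshold
            best_t, best_j = t, j
    return best_t


def per_site_thresholds(sites, labels, scores):
    """Group predictions by acquisition site and derive one threshold per site."""
    grouped = defaultdict(lambda: ([], []))
    for site, y, s in zip(sites, labels, scores):
        grouped[site][0].append(y)
        grouped[site][1].append(s)
    return {site: youden_threshold(ys, ss) for site, (ys, ss) in grouped.items()}


# Hypothetical predicted probabilities for two sites (1 = scan flagged by QC).
sites = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.6, 0.7, 0.5, 0.6, 0.8, 0.9]

pooled_auc = roc_auc(labels, scores)
pooled_threshold = youden_threshold(labels, scores)
site_thresholds = per_site_thresholds(sites, labels, scores)
```

In this toy data each site is perfectly separable on its own, yet the per-site thresholds differ because site B's scores are shifted upward, so a single pooled threshold misclassifies at least one scan. This is the kind of site-dependent shift that makes a global decision threshold hard to trust on unseen sites.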