Explainable Deep Learning to Profile Mitochondrial Disease Using High Dimensional Protein Expression Data

Paper title

Explainable Deep Learning to Profile Mitochondrial Disease Using High Dimensional Protein Expression Data

Authors

Atif Khan, Conor Lawless, Amy E. Vincent, Satish Pilla, Sushanth Ramesh, A. Stephen McGough

Abstract

Mitochondrial diseases are currently untreatable due to our limited understanding of their pathology. We study the expression of various mitochondrial proteins in skeletal myofibres (SM) in order to discover processes involved in mitochondrial pathology using Imaging Mass Cytometry (IMC). IMC produces high dimensional multichannel pseudo-images representing spatial variation in the expression of a panel of proteins within a tissue, including subcellular variation. Statistical analysis of these images requires semi-automated annotation of thousands of SMs in IMC images of patient muscle biopsies. In this paper we investigate the use of deep learning (DL) on raw IMC data to analyse it without any manual pre-processing steps, statistical summaries or statistical models. For this we first train state-of-the-art computer vision DL models on all available image channels, both combined and individually. We observed better-than-expected accuracy for many of these models. We then apply state-of-the-art explainable techniques relevant to computer vision DL to find the basis of the predictions of these models. Some of the resulting visual explainable maps highlight features in the images that appear consistent with the latest hypotheses about mitochondrial disease progression within myofibres.
