Paper Title
Weakly-supervised segmentation using inherently-explainable classification models and their application to brain tumour classification
Paper Authors
Paper Abstract
Deep learning models have shown their potential for several applications. However, most of these models are opaque and difficult to trust because of their complex reasoning, commonly known as the black-box problem. Some fields, such as medicine, require a high degree of transparency before such technologies can be accepted and adopted. Consequently, building trust in deep learning models requires either creating explainable/interpretable models or applying post-hoc methods to classifiers. Moreover, deep learning methods can be used for segmentation tasks, which typically require hard-to-obtain, time-consuming, manually-annotated segmentation labels for training. This paper introduces three inherently-explainable classifiers that tackle both of these problems as one. The localisation heatmaps provided by the networks, which represent the models' focus areas and are used in classification decision-making, can be interpreted directly, without requiring any post-hoc methods to derive information for model explanation. The models are trained in a supervised fashion using only the input images and the classification labels as ground truth, without any information about the location of the region of interest (i.e. the segmentation labels), making the segmentation training of the models weakly-supervised through classification labels. The final segmentation is obtained by thresholding these heatmaps. The models were employed for the task of multi-class brain tumour classification using two different datasets, resulting in a best F1-score of 0.93 for the supervised classification task while securing a median Dice score of 0.67$\pm$0.08 for the weakly-supervised segmentation task. Furthermore, on a subset of tumour-only images, the obtained accuracy outperformed state-of-the-art glioma tumour grading binary classifiers, with the best model achieving 98.7\% accuracy.
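The heatmap-thresholding step and the Dice evaluation described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy example, not the paper's implementation: the normalisation scheme, the fixed threshold of 0.5, and the function names are all hypothetical choices for illustration.

```python
import numpy as np

def heatmap_to_mask(heatmap, threshold=0.5):
    """Binarise a localisation heatmap into a segmentation mask.

    The min-max normalisation and the 0.5 threshold are illustrative
    assumptions; the abstract does not specify how the threshold is chosen.
    """
    h = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
    return h >= threshold

def dice_score(pred, target):
    """Dice overlap between a predicted mask and a ground-truth mask."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy example: a peaked heatmap and a perfectly matching ground-truth mask.
heatmap = np.zeros((8, 8))
heatmap[2:6, 2:6] = 1.0
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True

mask = heatmap_to_mask(heatmap)
print(dice_score(mask, gt))  # 1.0 for this perfectly aligned toy case
```

In practice the Dice score is computed per image and then summarised (the paper reports a median of 0.67), and the threshold could be tuned on a validation set rather than fixed.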