Paper title
Sickle-cell disease diagnosis support selecting the most appropriate machine learning method: Towards a general and interpretable approach for cell morphology analysis from microscopy images
Paper authors
Paper abstract
In this work, we propose an approach for selecting the classification method and features that, among state-of-the-art options, give the best performance for diagnostic support from peripheral blood smear images of red blood cells. We used samples from patients with sickle-cell disease, but the approach can be generalized to other study cases. To build trust in the behavior of the proposed system, we also analyzed its interpretability. We pre-processed and segmented the microscopy images to ensure high feature quality. We applied methods from the literature to extract features from the blood cells, and machine learning methods to classify their morphology. Next, we searched for the best parameters of these methods using the data produced in the feature extraction phase. We then found the best parameters for every classifier using Randomized and Grid search. For the sake of scientific progress, we published the parameters for each classifier, the implemented code library, and the confusion matrices with the raw data, and we used the public erythrocytesIDB dataset for validation. We also defined how to select the most important features for classification, both to decrease complexity and training time and to support interpretability in opaque models. Finally, comparing the best-performing classification methods with the state of the art, we obtained better results even with interpretable model classifiers.
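The parameter search mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' published code: the data are synthetic 2-D points rather than erythrocytesIDB features, a small k-nearest-neighbour classifier stands in for the classifiers studied in the paper, and chaining a coarse randomized search into a fine grid search is only one plausible way to combine the two techniques. The helper names (`make_data`, `knn_predict`, `cv_accuracy`, `search`) are hypothetical.

```python
import itertools
import random
from collections import Counter


def make_data(n_per_class, rng):
    """Synthetic 2-feature, 3-class data (a stand-in for cell shape features)."""
    centers = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (0.0, 3.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k, p):
    """Majority vote among the k nearest training points.

    Uses the Minkowski distance of order p without the final root,
    which does not change the neighbour ranking.
    """
    def dist(a, b):
        return sum(abs(u - v) ** p for u, v in zip(a, b))

    nearest = sorted(train, key=lambda item: dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, params, folds=3):
    """Mean accuracy over interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        hits = sum(knn_predict(train, x, **params) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds


def search(data, candidates):
    """Return (best_params, best_score) over an iterable of parameter dicts."""
    scored = [(cv_accuracy(data, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda t: t[0])
    return best, best_score


rng = random.Random(0)
data = make_data(30, rng)

# Randomized search: evaluate a few random points from a wide space.
space = {"k": list(range(1, 22, 2)), "p": [1, 2, 3]}
sampled = [{name: rng.choice(values) for name, values in space.items()}
           for _ in range(6)]
coarse_best, coarse_score = search(data, sampled)

# Grid search: exhaustively evaluate a small neighbourhood of the coarse best.
k0 = coarse_best["k"]
grid = {"k": sorted({max(1, k0 - 2), k0, k0 + 2}), "p": [1, 2, 3]}
grid_candidates = [dict(zip(grid, combo))
                   for combo in itertools.product(*grid.values())]
best_params, best_score = search(data, grid_candidates)

print("randomized search:", coarse_best, round(coarse_score, 3))
print("grid search:      ", best_params, round(best_score, 3))
```

Because the fine grid always contains the configuration found by the randomized stage, the grid-search score in this sketch can never be lower than the randomized-search score. The randomized stage keeps the number of evaluations fixed regardless of how large the search space is, while the grid stage spends a small exhaustive budget where it is most likely to pay off.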