Paper Title
A New Method to Compare the Interpretability of Rule-based Algorithms
Authors
Abstract
Interpretability is becoming increasingly important for predictive model analysis. Unfortunately, as many authors have remarked, there is still no consensus regarding this notion. The goal of this paper is to define a score that allows interpretable algorithms to be compared quickly. The definition consists of three terms, each measured quantitatively with a simple formula: predictivity, stability and simplicity. Predictivity has been studied extensively as a measure of the accuracy of predictive algorithms. Stability is based on the Dice-Sørensen index, which is used to compare two rule sets generated by an algorithm from two independent samples. Simplicity is based on the sum of the lengths of the rules derived from the predictive model. The proposed score is a weighted sum of these three terms. We use this score to compare the interpretability of a set of rule-based algorithms and tree-based algorithms, in both the regression case and the classification case.
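As a rough illustration of the quantities named in the abstract, the following Python sketch computes the Dice-Sørensen index between two rule sets, the sum of rule lengths, and a weighted sum of three terms. It is a minimal sketch under stated assumptions, not the paper's implementation. The abstract does not say how rules are represented, how two rules are matched, or how predictivity and the sum of rule lengths are turned into score terms. Here a rule is assumed to be a frozenset of condition strings, two rules match only if they are exactly equal, and the three terms passed to the hypothetical helper `weighted_score` are assumed to be already computed.

```python
from typing import FrozenSet, Sequence, Set

# Assumption for illustration: a rule is a frozenset of condition strings,
# and two rules are "the same" only if their conditions are exactly equal.
Rule = FrozenSet[str]


def dice_sorensen(rules_a: Set[Rule], rules_b: Set[Rule]) -> float:
    """Dice-Sorensen index 2|A n B| / (|A| + |B|) between two rule sets."""
    total = len(rules_a) + len(rules_b)
    if total == 0:
        # Convention chosen here (not from the source): two empty sets agree.
        return 1.0
    return 2 * len(rules_a & rules_b) / total


def total_rule_length(rules: Set[Rule]) -> int:
    """Sum of rule lengths, a rule's length being its number of conditions."""
    return sum(len(rule) for rule in rules)


def weighted_score(terms: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of precomputed terms (hypothetical helper)."""
    if len(terms) != len(weights):
        raise ValueError("terms and weights must have the same length")
    return sum(w * t for w, t in zip(weights, terms))


# Rule sets as they might be produced from two independent samples.
rules_a = {frozenset({"x1 > 0", "x2 <= 3"}), frozenset({"x3 > 1"})}
rules_b = {frozenset({"x1 > 0", "x2 <= 3"}), frozenset({"x4 <= 2"})}

stability = dice_sorensen(rules_a, rules_b)   # one shared rule out of 2 + 2
length_sum = total_rule_length(rules_a)       # 2 conditions + 1 condition

# Made-up values for predictivity and simplicity, with equal weights.
score = weighted_score((0.8, stability, 0.6), (1 / 3, 1 / 3, 1 / 3))
```

With these toy rule sets, one rule out of two is shared, so the stability term is 0.5, and the first rule set has a total length of 3. The equal weights and the values 0.8 and 0.6 are placeholders chosen only to show how the weighted sum combines the terms.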