Title

Post-hoc explanation of black-box classifiers using confident itemsets

Authors

Milad Moradi, Matthias Samwald

Abstract

Black-box Artificial Intelligence (AI) methods, e.g. deep neural networks, have been widely utilized to build predictive models that can extract complex relationships in a dataset and make predictions for new unseen data records. However, it is difficult to trust decisions made by such methods since their inner workings and decision logic are hidden from the user. Explainable Artificial Intelligence (XAI) refers to systems that try to explain how a black-box AI model produces its outcomes. Post-hoc XAI methods approximate the behavior of a black-box by extracting relationships between feature values and the predictions. Perturbation-based and decision set methods are among commonly used post-hoc XAI systems. The former explanators rely on random perturbations of data records to build local or global linear models that explain individual predictions or the whole model. The latter explanators use those feature values that appear more frequently to construct a set of decision rules that produces the same outcomes as the target black-box. However, these two classes of XAI methods have some limitations. Random perturbations do not take into account the distribution of feature values in different subspaces, leading to misleading approximations. Decision sets only pay attention to frequent feature values and miss many important correlations between features and class labels that appear less frequently but accurately represent decision boundaries of the model. In this paper, we address the above challenges by proposing an explanation method named Confident Itemsets Explanation (CIE). We introduce confident itemsets, sets of feature values that are highly correlated with a specific class label. CIE utilizes confident itemsets to discretize the whole decision space of a model into smaller subspaces.
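
The abstract's central notion, itemsets of feature values that are strongly associated with one class label, can be illustrated with a short sketch of confidence-based itemset mining. This is a minimal illustrative example under assumed names and thresholds (`mine_confident_itemsets`, `min_conf`, the toy records), not the authors' CIE implementation, which the paper itself describes in full.

```python
from collections import Counter
from itertools import combinations

def mine_confident_itemsets(records, labels, max_len=2, min_conf=0.8):
    """Sketch: find itemsets of (feature, value) pairs whose confidence
    toward a class label meets min_conf, where
    confidence(itemset -> label) = count(itemset with label) / count(itemset)."""
    itemset_counts = Counter()        # occurrences of each itemset
    itemset_label_counts = Counter()  # co-occurrences of itemset and label
    for record, label in zip(records, labels):
        items = tuple(sorted(record.items()))  # discrete (feature, value) pairs
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                itemset_counts[itemset] += 1
                itemset_label_counts[(itemset, label)] += 1

    confident = []
    for (itemset, label), co_count in itemset_label_counts.items():
        conf = co_count / itemset_counts[itemset]
        if conf >= min_conf:
            confident.append((itemset, label, conf))
    return confident

# Toy usage: each record is a dict of discrete feature values.
records = [
    {"age": "young", "income": "low"},
    {"age": "young", "income": "high"},
    {"age": "old", "income": "low"},
    {"age": "old", "income": "high"},
]
labels = ["reject", "accept", "reject", "accept"]
for itemset, label, conf in mine_confident_itemsets(records, labels, min_conf=1.0):
    print(dict(itemset), "->", label, f"(confidence={conf:.2f})")
```

In this toy data, itemsets such as {income=high} reach full confidence toward "accept", so they would define one of the smaller class-specific subspaces that CIE uses in place of random perturbations or purely frequency-based rules.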
