Paper Title
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation
Paper Authors
Paper Abstract
Counterfactual explanation is an important Explainable AI technique for explaining machine learning predictions. Despite being actively studied, existing optimization-based methods often assume that the underlying machine learning model is differentiable and treat categorical attributes as continuous ones, which restricts their real-world applications when categorical attributes have many distinct values or the model is non-differentiable. To make counterfactual explanation suitable for real-world applications, we propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE), which adopts a newly designed pipeline that can efficiently handle non-differentiable machine learning models with a large number of feature values. In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness of MACE, showing better validity, sparsity, and proximity.
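To make the setting concrete, the following is a minimal, illustrative sketch of model-agnostic counterfactual search. It is NOT the MACE pipeline described above (it uses exhaustive sparsity-first search rather than RL or gradient-less descent), but it shows why treating the model as a black box handles non-differentiable models and categorical features naturally. The `toy_model`, feature names, and candidate values are hypothetical.

```python
import itertools

def find_counterfactual(predict, x, candidate_values, max_changes=2):
    """Greedy, sparsity-first counterfactual search (illustrative only).

    Treats `predict` as a black box, so it works for non-differentiable
    models and categorical attributes. Tries flipping the prediction by
    changing 1 feature, then 2, ... up to `max_changes` features.
    """
    original = predict(x)
    for k in range(1, max_changes + 1):
        # All subsets of k feature positions (sparser changes first).
        for idxs in itertools.combinations(range(len(x)), k):
            # All joint assignments of candidate values to those positions.
            for values in itertools.product(*(candidate_values[i] for i in idxs)):
                cf = list(x)
                for i, v in zip(idxs, values):
                    cf[i] = v
                if predict(cf) != original:
                    return cf  # first (sparsest) example that flips the label
    return None

# Hypothetical black-box model: approves a loan only when income is "high"
# and the applicant is not unemployed. No gradients are available or needed.
def toy_model(x):
    income, employment = x
    return 1 if income == "high" and employment != "unemployed" else 0

x = ["low", "employed"]
candidates = [["low", "medium", "high"],
              ["unemployed", "employed", "self-employed"]]
cf = find_counterfactual(toy_model, x, candidates)
print(cf)  # ["high", "employed"] — a single-feature change flips the prediction
```

Such exhaustive search becomes intractable when categorical attributes have many distinct values, which is precisely the regime the abstract says MACE's RL-based search is designed for.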