Paper Title
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation
Paper Authors
Paper Abstract
Counterfactual explanation is an important Explainable AI technique for explaining machine learning predictions. Despite being actively studied, existing optimization-based methods often assume that the underlying machine learning model is differentiable and treat categorical attributes as continuous ones, which restricts their real-world applications when categorical attributes have many distinct values or the model is non-differentiable. To make counterfactual explanation suitable for real-world applications, we propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE), which adopts a newly designed pipeline that can efficiently handle non-differentiable machine learning models with a large number of feature values. In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness of MACE, showing better validity, sparsity, and proximity.
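To make the setting concrete, the following is a minimal, illustrative sketch of model-agnostic counterfactual search. It is NOT the MACE pipeline described above (it uses exhaustive sparsity-first search rather than RL or gradient-less descent), but it shows why treating the model as a black box handles non-differentiable models and categorical features naturally. The `toy_model`, feature names, and candidate values are hypothetical.

```python
import itertools

def find_counterfactual(predict, x, candidate_values, max_changes=2):
    """Greedy, sparsity-first counterfactual search (illustrative only).

    Treats `predict` as a black box, so it works for non-differentiable
    models and categorical attributes. Tries flipping the prediction by
    changing 1 feature, then 2, ... up to `max_changes` features.
    """
    original = predict(x)
    for k in range(1, max_changes + 1):
        # All subsets of k feature positions (sparser changes first).
        for idxs in itertools.combinations(range(len(x)), k):
            # All joint assignments of candidate values to those positions.
            for values in itertools.product(*(candidate_values[i] for i in idxs)):
                cf = list(x)
                for i, v in zip(idxs, values):
                    cf[i] = v
                if predict(cf) != original:
                    return cf  # first (sparsest) example that flips the label
    return None

# Hypothetical black-box model: approves a loan only when income is "high"
# and the applicant is not unemployed. No gradients are available or needed.
def toy_model(x):
    income, employment = x
    return 1 if income == "high" and employment != "unemployed" else 0

x = ["low", "employed"]
candidates = [["low", "medium", "high"],
              ["unemployed", "employed", "self-employed"]]
cf = find_counterfactual(toy_model, x, candidates)
print(cf)  # ["high", "employed"] — a single-feature change flips the prediction
```

Such exhaustive search becomes intractable when categorical attributes have many distinct values, which is precisely the regime the abstract says MACE's RL-based search is designed for.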