论文标题
EMAP:可解释的AI具有基于多种扰动的AI
EMaP: Explainable AI with Manifold-based Perturbations
论文作者
论文摘要
在过去的几年中,已经引入了许多基于输入数据扰动的解释方法,以提高我们对黑盒模型做出的决策的理解。这项工作的目的是引入一种新颖的扰动方案,以便可以获得更忠实和强大的解释。我们的研究重点是扰动方向对数据拓扑的影响。我们表明,沿输入歧管的正交方向的扰动更好地保留了数据拓扑,既在对离散的Gromov-Hausdorff距离的最坏情况分析中,以及通过持久同源性的平均案例分析。从这些结果中,我们引入EMAP算法,实现正交扰动方案。我们的实验表明,EMAP不仅改善了解释者的性能,还可以帮助他们克服最近对基于扰动的方法的攻击。
In the last few years, many explanation methods based on the perturbations of input data have been introduced to improve our understanding of decisions made by black-box models. The goal of this work is to introduce a novel perturbation scheme so that more faithful and robust explanations can be obtained. Our study focuses on the impact of perturbing directions on the data topology. We show that perturbing along the orthogonal directions of the input manifold better preserves the data topology, both in the worst-case analysis of the discrete Gromov-Hausdorff distance and in the average-case analysis via persistent homology. From those results, we introduce EMaP algorithm, realizing the orthogonal perturbation scheme. Our experiments show that EMaP not only improves the explainers' performance but also helps them overcome a recently-developed attack against perturbation-based methods.