Paper Title

Boundary-Aware Uncertainty for Feature Attribution Explainers

Authors

Davin Hill, Aria Masoomi, Max Torop, Sandesh Ghimire, Jennifer Dy

Abstract

Post-hoc explanation methods have become a critical tool for understanding black-box classifiers in high-stakes applications. However, high-performing classifiers are often highly nonlinear and can exhibit complex behavior around the decision boundary, leading to brittle or misleading local explanations. There is therefore a pressing need to quantify the uncertainty of such explanation methods in order to understand when explanations are trustworthy. In this work, we propose the Gaussian Process Explanation UnCertainty (GPEC) framework, which generates a unified uncertainty estimate combining decision boundary-aware uncertainty with explanation function approximation uncertainty. We introduce a novel geodesic-based kernel, which captures the complexity of the target black-box decision boundary. We show theoretically that the proposed kernel similarity increases with decision boundary complexity. The proposed framework is highly flexible; it can be used with any black-box classifier and feature attribution method. Empirical results on multiple tabular and image datasets show that the GPEC uncertainty estimate improves understanding of explanations as compared to existing methods.
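To make the general recipe in the abstract concrete, below is a minimal sketch (not the authors' implementation): fit a Gaussian Process to the outputs of a feature-attribution explainer and read the predictive standard deviation as an explanation-uncertainty estimate. The black_box classifier, the finite-difference explainer, and the RBF + white-noise kernel are all illustrative assumptions; in GPEC the spatial kernel would instead be the paper's geodesic-based, boundary-aware kernel.

```python
# Sketch: GP-based uncertainty over feature attributions (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def black_box(X):
    # Stand-in black-box classifier: smooth probability, nonlinear boundary.
    return 1.0 / (1.0 + np.exp(-5.0 * X[:, 0] * X[:, 1]))

def explainer(X, eps=1e-2):
    # Stand-in feature-attribution method: central finite-difference saliency.
    # Any attribution method (SHAP, LIME, gradients, ...) could be used here.
    attrs = np.zeros_like(X)
    for j in range(X.shape[1]):
        step = np.zeros(X.shape[1])
        step[j] = eps
        attrs[:, j] = (black_box(X + step) - black_box(X - step)) / (2 * eps)
    return attrs

X_train = rng.normal(size=(200, 2))   # points where explanations were queried
E_train = explainer(X_train)          # attributions observed at those points

# One GP per attribution dimension. WhiteKernel absorbs the explainer's
# sampling noise (function-approximation uncertainty); the RBF term is a
# placeholder for where GPEC's boundary-aware geodesic kernel would go.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
gps = [
    GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        .fit(X_train, E_train[:, j])
    for j in range(E_train.shape[1])
]

X_test = rng.normal(size=(3, 2))
for j, gp in enumerate(gps):
    mean, std = gp.predict(X_test, return_std=True)
    print(f"feature {j}: attribution {mean.round(3)}, uncertainty {std.round(3)}")
```

In this sketch the WhiteKernel term plays the role of the explanation function approximation uncertainty, while the choice of spatial kernel is where decision-boundary awareness would enter; swapping in a kernel whose similarity reflects boundary complexity is the contribution the abstract describes.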
