Paper Title


On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective

Authors

Mathieu Serrurier, Franck Mamalet, Thomas Fel, Louis Béthune, Thibaut Boissin

Abstract


Input gradients have a pivotal role in a variety of applications, including adversarial attack algorithms for evaluating model robustness, explainable AI techniques for generating Saliency Maps, and counterfactual explanations. However, Saliency Maps generated by traditional neural networks are often noisy and provide limited insights. In this paper, we demonstrate that, on the contrary, the Saliency Maps of 1-Lipschitz neural networks, learned with the dual loss of an optimal transportation problem, exhibit desirable XAI properties: they are highly concentrated on the essential parts of the image with low noise, significantly outperforming state-of-the-art explanation approaches across various models and metrics. We also prove that these maps align unprecedentedly well with human explanations on ImageNet. To explain the particularly beneficial properties of the Saliency Map for such models, we prove that this gradient encodes both the direction of the transportation plan and the direction towards the nearest adversarial attack. Following the gradient down to the decision boundary is no longer considered an adversarial attack, but rather a counterfactual explanation that explicitly transports the input from one class to another. Thus, learning with such a loss jointly optimizes the classification objective and the alignment of the gradient, i.e. the Saliency Map, with the transportation plan direction. These networks were previously known to be certifiably robust by design, and we demonstrate that they scale well for large problems and models, and can be tailored for explainability using a fast and straightforward method.
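The "follow the gradient down to the decision boundary" idea from the abstract can be illustrated with a minimal toy sketch. This is not the paper's implementation: it assumes a hypothetical linear 1-Lipschitz score `f(x) = w·x` with a unit-norm `w`, for which the input gradient is `w` everywhere and a step of length `|f(x)|` against the gradient lands exactly on the boundary `f = 0`, turning the gradient step into a counterfactual.

```python
import numpy as np

# Hypothetical 1-Lipschitz score function f(x) = w . x with ||w||_2 = 1.
# Because ||w|| = 1, f is 1-Lipschitz, its input gradient (the "saliency
# map") is w everywhere, and |f(x)| is the exact distance to the
# decision boundary f = 0.
w = np.array([0.6, 0.8])           # unit-norm weights (illustrative only)
f = lambda x: w @ x                # classifier score, sign = predicted class

x = np.array([2.0, 1.0])           # input to explain
grad = w                           # saliency map = input gradient of f
x_cf = x - f(x) * grad             # step of length |f(x)| down the gradient

print(f(x))                        # signed distance to the boundary
print(np.allclose(f(x_cf), 0.0))  # the counterfactual sits on the boundary
```

For a deep 1-Lipschitz network the gradient would instead come from automatic differentiation, but the geometry is the same: the gradient direction doubles as the transport-plan direction toward the other class.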
