Paper Title
Data-Adaptive Discriminative Feature Localization with Statistically Guaranteed Interpretation
Paper Authors
Paper Abstract
In explainable artificial intelligence, discriminative feature localization is critical for revealing a black-box model's decision-making process from raw data to prediction. In this article, we use two real datasets, the MNIST handwritten digits and the MIT-BIH electrocardiogram (ECG) signals, to motivate key characteristics of discriminative features, namely adaptiveness, predictive importance, and effectiveness. We then develop a localization framework based on adversarial attacks to effectively localize discriminative features. In contrast to existing heuristic methods, we also provide statistically guaranteed interpretability of the localized features by measuring a generalized partial $R^2$. We apply the proposed method to the MNIST dataset and the MIT-BIH dataset with a convolutional auto-encoder. On MNIST, the compact image regions localized by the proposed method are visually appealing. Similarly, on MIT-BIH, the identified ECG features are biologically plausible and consistent with cardiac electrophysiological principles, and they locate subtle anomalies in a QRS complex that may not be discernible by the naked eye. Overall, the proposed method compares favorably with state-of-the-art competitors. Accompanying this paper is a Python library, dnn-locate (https://dnn-locate.readthedocs.io/en/latest/), that implements the proposed approach.