Paper Title
How Much Can I Trust You? -- Quantifying Uncertainties in Explaining Neural Networks
Paper Authors
Paper Abstract
Explainable AI (XAI) aims to provide interpretations for predictions made by learning machines, such as deep neural networks, in order to make the machines more transparent for the user and, furthermore, trustworthy for applications in, e.g., safety-critical areas. So far, however, no methods have been conceived for quantifying the uncertainty of explanations, which is problematic in domains where high confidence in explanations is a prerequisite. We therefore contribute by proposing a new framework that allows any explanation method for neural networks to be converted into an explanation method for Bayesian neural networks, with built-in modeling of uncertainties. Within the Bayesian framework, a network's weights follow a distribution, which extends standard single explanation scores and heatmaps to distributions thereof, thereby translating the intrinsic network model uncertainties into a quantification of explanation uncertainties. This allows us, for the first time, to carve out the uncertainties associated with a model explanation and subsequently gauge the appropriate level of explanation confidence for a user (using percentiles). We demonstrate the effectiveness and usefulness of our approach extensively in various experiments, both qualitatively and quantitatively.
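As a rough illustration of the idea described in the abstract, the sketch below samples explanations under an approximate posterior over the network's weights and summarizes them with a mean heatmap and percentile bands. It uses MC dropout and plain gradient saliency as hypothetical stand-ins (the paper's framework is stated to apply to any explanation method); all model and variable names are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch: explanation uncertainty via Monte Carlo sampling of weights.
# MC dropout approximates a Bayesian posterior; gradient saliency stands in for
# an arbitrary explanation method. Names and settings are illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 2))
model.train()  # keep dropout active so each forward pass draws a weight configuration

x = torch.randn(1, 20, requires_grad=True)
target_class = 1
saliency_samples = []

for _ in range(100):  # Monte Carlo samples from the (approximate) posterior
    model.zero_grad()
    if x.grad is not None:
        x.grad = None
    score = model(x)[0, target_class]
    score.backward()
    saliency_samples.append(x.grad.detach().abs().clone())  # one explanation per weight sample

samples = torch.stack(saliency_samples)  # shape: (100, 1, 20) -> distribution of heatmaps
mean_map = samples.mean(dim=0)           # point-estimate explanation
lo, hi = torch.quantile(samples, torch.tensor([0.05, 0.95]), dim=0)  # percentile bands per feature
```

A wide gap between the lower and upper percentiles for a feature would indicate low confidence in that part of the explanation, which is the kind of per-feature uncertainty signal the abstract refers to.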