Paper title
Anomalous behaviour in loss-gradient based interpretability methods
Paper authors
Paper abstract
Loss gradients are used to interpret the decision-making process of deep learning models. In this work, we evaluate loss-gradient based attribution methods by occluding parts of the input and comparing the model's performance on the occluded input with its performance on the original input. We observe that, under certain conditions, performance on the occluded input is better than on the original input across the test dataset. Similar behaviour is observed in sound and image recognition tasks. We explore different loss-gradient attribution methods, occlusion levels and replacement values to explain this improvement in performance under occlusion.
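The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy logistic-regression classifier in place of a deep network, uses the absolute loss gradient as the attribution score, and all function names (`loss_gradient`, `occlude`, `accuracy_under_occlusion`) are hypothetical. For this model the gradient of the binary cross-entropy loss with respect to the input is (p - y) * w, so no autodiff library is needed.

```python
import math

def predict_proba(w, b, x):
    # Toy stand-in for a deep model: logistic regression on a flat input.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(w, b, x, y):
    # Gradient of binary cross-entropy with respect to the input x.
    # For logistic regression this is (p - y) * w.
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def occlude(x, attribution, fraction, replacement=0.0):
    # Replace the top `fraction` of features, ranked by absolute
    # attribution, with `replacement` (the replacement value).
    k = int(round(fraction * len(x)))
    ranked = sorted(range(len(x)), key=lambda i: abs(attribution[i]), reverse=True)
    masked = set(ranked[:k])
    return [replacement if i in masked else xi for i, xi in enumerate(x)]

def accuracy(w, b, inputs, labels):
    # Fraction of inputs whose thresholded prediction matches the label.
    correct = sum(
        int((predict_proba(w, b, x) >= 0.5) == bool(y))
        for x, y in zip(inputs, labels)
    )
    return correct / len(labels)

def accuracy_under_occlusion(w, b, inputs, labels, fraction, replacement=0.0):
    # Occlude each input using its own loss-gradient attribution,
    # then score the model on the occluded inputs.
    occluded = [
        occlude(x, loss_gradient(w, b, x, y), fraction, replacement)
        for x, y in zip(inputs, labels)
    ]
    return accuracy(w, b, occluded, labels)
```

Comparing `accuracy(w, b, inputs, labels)` with `accuracy_under_occlusion(...)` over a range of `fraction` and `replacement` values mirrors the comparison the abstract describes between original and occluded inputs. Whether occlusion lowers or raises performance depends on the model, the data and the chosen settings.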