Paper Title
用可解释的深层分类器解释跨域识别
Explaining Cross-Domain Recognition with Interpretable Deep Classifier
Authors
Abstract
Recent advances in deep learning predominantly build models around internal representations, leaving the rationale behind their decisions opaque to human users. Such explainability is especially essential for domain adaptation, whose challenges require developing models that adapt across different domains. In this paper, we ask the question: how much does each sample in the source domain contribute to the network's prediction on samples from the target domain? To address this, we devise a novel Interpretable Deep Classifier (IDC) that learns the nearest source samples of a target sample as the evidence upon which the classifier makes its decision. Technically, IDC maintains a differentiable memory bank for each category, and each memory slot takes the form of a key-value pair. The key records the features of discriminative source samples, and the value stores the corresponding properties, e.g., representative scores measuring how well the features describe the category. IDC computes the loss between its output and the labels of the source samples, and back-propagates it to adjust the representative scores and update the memory banks. Extensive experiments on the Office-Home and VisDA-2017 datasets demonstrate that our IDC leads to a more explainable model with almost no accuracy degradation, and effectively calibrates classification for optimal reject options. More remarkably, when taking IDC as a prior interpreter, training on only the 0.1% of source data selected by IDC still yields results superior to using the full training set on VisDA-2017 for unsupervised domain adaptation.
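The mechanism sketched in the abstract can be illustrated with a toy implementation. The sketch below is a minimal, hypothetical rendering of the described idea, not the authors' code: each category owns a memory bank of key-value slots (keys are source-sample features, values are representative scores), a target sample is classified by the evidence accumulated across banks, the nearest source samples can be retrieved as the cited evidence, and a cross-entropy gradient step on the scores stands in for the paper's back-propagation through the memory banks. All class and function names here are illustrative assumptions.

```python
import numpy as np

class CategoryMemoryBank:
    """One per-category memory bank (illustrative sketch of IDC's slots).

    keys   : (slots, dim) features of discriminative source samples.
    scores : (slots,) representative scores of those features.
    """
    def __init__(self, keys, scores):
        self.keys = np.asarray(keys, dtype=float)
        self.scores = np.asarray(scores, dtype=float)

    def similarity(self, feature):
        # Cosine similarity between a target feature and every key.
        k = self.keys / np.linalg.norm(self.keys, axis=1, keepdims=True)
        f = feature / np.linalg.norm(feature)
        return k @ f

    def class_logit(self, feature):
        # Evidence for this category: similarity weighted by representativeness.
        return float(self.similarity(feature) @ self.scores)

    def nearest_evidence(self, feature, top_k=1):
        # Indices of the source samples the classifier cites as evidence.
        return np.argsort(-self.similarity(feature))[:top_k]

def classify(banks, feature):
    """Predict the category whose memory bank accumulates the most evidence."""
    logits = {c: bank.class_logit(feature) for c, bank in banks.items()}
    return max(logits, key=logits.get), logits

def update_scores(banks, feature, label, lr=0.1):
    """One SGD step on the representative scores (a crude stand-in for the
    paper's back-propagation). Since logit_c = sim_c @ scores_c, the
    cross-entropy gradient w.r.t. scores_c is (p_c - y_c) * sim_c."""
    _, logits = classify(banks, feature)
    cats = list(logits)
    z = np.array([logits[c] for c in cats])
    p = np.exp(z - z.max())
    p /= p.sum()
    for i, c in enumerate(cats):
        grad = p[i] - (1.0 if c == label else 0.0)
        banks[c].scores -= lr * grad * banks[c].similarity(feature)
```

In this toy, a source-labelled feature both nudges the scores of the true category's slots upward and suppresses the competing banks, which is one plausible way the "representative scores" could be made differentiable end to end.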