Paper Title
Interpretable Graph Capsule Networks for Object Recognition
Paper Authors
Paper Abstract
Capsule Networks (CapsNets), as alternatives to Convolutional Neural Networks (CNNs), have been proposed to recognize objects from images. The current literature demonstrates many advantages of CapsNets over CNNs. However, how to create explanations for individual classifications of CapsNets has not been well explored. The widely used saliency methods are mainly proposed for explaining CNN-based classifications; they create saliency-map explanations by combining activation values and the corresponding gradients, e.g., Grad-CAM. These saliency methods require a specific architecture of the underlying classifier and cannot be trivially applied to CapsNets due to the iterative routing mechanism therein. To overcome the lack of interpretability, we can either propose new post-hoc interpretation methods for CapsNets or modify the model to have built-in explanations. In this work, we explore the latter. Specifically, we propose interpretable Graph Capsule Networks (GraCapsNets), where we replace the routing part with a multi-head attention-based graph pooling approach. In the proposed model, individual classification explanations can be created effectively and efficiently. Our model also demonstrates some unexpected benefits, even though it replaces a fundamental part of CapsNets. Compared to CapsNets, our GraCapsNets achieve better classification performance with fewer parameters and better adversarial robustness. In addition, GraCapsNets retain other advantages of CapsNets, namely disentangled representations and affine-transformation robustness.
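The core change the abstract describes, replacing iterative routing with multi-head attention-based graph pooling, can be illustrated with a minimal sketch. The module below is a hypothetical simplification rather than the paper's reference implementation: all names, shapes, and the number of heads are assumptions, and the graph structure over primary capsules is omitted for brevity. Each attention head scores every input capsule for every class, and the normalized scores weight a pooled class capsule; since the weights live over spatially arranged input capsules, they can in principle be reshaped into a saliency map, which suggests how explanations become cheap to produce.

```python
# Hypothetical sketch: multi-head attention-based pooling of primary
# capsules into class capsules, standing in for iterative routing.
# Shapes, names, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttentionPooling(nn.Module):
    def __init__(self, in_dim: int, num_classes: int, num_heads: int = 4):
        super().__init__()
        # One relevance score per (class, head) for each input capsule.
        self.score = nn.Linear(in_dim, num_classes * num_heads)
        self.num_classes = num_classes
        self.num_heads = num_heads

    def forward(self, capsules: torch.Tensor) -> torch.Tensor:
        # capsules: (batch, num_capsules, in_dim)
        B, N, _ = capsules.shape
        logits = self.score(capsules).view(B, N, self.num_classes, self.num_heads)
        # Normalize over the input capsules: each head distributes one unit
        # of attention per class. These weights are the per-capsule relevance
        # scores that could be visualized as a saliency-map explanation.
        attn = F.softmax(logits, dim=1)
        # Attention-weighted sum of capsules per class, averaged over heads:
        # (B, N, C, H) x (B, N, D) -> (B, C, D)
        return torch.einsum('bnch,bnd->bcd', attn, capsules) / self.num_heads

# Usage: pool 1152 eight-dimensional primary capsules into 10 class capsules.
pool = MultiHeadAttentionPooling(in_dim=8, num_classes=10, num_heads=4)
class_capsules = pool(torch.randn(32, 1152, 8))
print(class_capsules.shape)  # torch.Size([32, 10, 8])
```

Averaging over heads is only one plausible way to merge them; concatenating heads or letting each head vote separately would be equally reasonable design choices under the same idea.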