Paper Title
Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
Paper Authors
Paper Abstract
Image-level weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years. Most advanced solutions exploit class activation maps (CAMs). However, CAMs can hardly serve as object masks due to the gap between full and weak supervision. In this paper, we propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap. Our method is based on the observation that equivariance is an implicit constraint in fully supervised semantic segmentation, whose pixel-level labels take the same spatial transformation as the input images during data augmentation. However, this constraint is lost for CAMs trained with image-level supervision. Therefore, we propose consistency regularization on CAMs predicted from differently transformed images to provide self-supervision for network learning. Moreover, we propose a pixel correlation module (PCM), which exploits contextual appearance information and refines the prediction of the current pixel using its similar neighbors, leading to further improvement in CAM consistency. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our method outperforms state-of-the-art methods using the same level of supervision. The code is released online.
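To make the two key ideas in the abstract concrete, below is a minimal PyTorch-style sketch. It is not the authors' released implementation; `model`, `features`, and the choice of rescaling as the spatial transform are illustrative assumptions. The first function expresses the equivariance consistency regularization: CAMs predicted from a transformed image should match the transformed CAMs of the original image. The second function sketches a PCM-style refinement in which each pixel's CAM score is re-estimated as an affinity-weighted average over pixels with similar appearance features.

import torch
import torch.nn.functional as F

def equivariant_consistency_loss(model, images, scale=0.5):
    # model: any network mapping an image batch to per-class CAMs (hypothetical interface)
    # CAMs of the original image, then rescaled: A(F(x))
    cams = model(images)                                     # (B, C, H, W)
    cams_down = F.interpolate(cams, scale_factor=scale,
                              mode='bilinear', align_corners=False)
    # CAMs of the rescaled image: F(A(x))
    images_down = F.interpolate(images, scale_factor=scale,
                                mode='bilinear', align_corners=False)
    cams_of_down = model(images_down)
    # Equivariance regularization: the two CAM sets should agree
    return torch.mean(torch.abs(cams_down - cams_of_down))

def pixel_correlation_refine(cams, features):
    # PCM-style refinement sketch: each pixel's CAM is re-estimated from
    # pixels with similar appearance features (a non-local-style operation).
    # cams: (B, C, H, W); features: (B, D, H, W), same spatial size as cams.
    B, C, H, W = cams.shape
    f = F.normalize(features.flatten(2), dim=1)              # (B, D, H*W)
    affinity = torch.relu(torch.bmm(f.transpose(1, 2), f))   # (B, H*W, H*W)
    affinity = affinity / (affinity.sum(dim=-1, keepdim=True) + 1e-5)
    refined = torch.bmm(cams.flatten(2), affinity.transpose(1, 2))
    return refined.view(B, C, H, W)

Rescaling stands in here for the broader family of spatial transforms used as data augmentation; any transform applied consistently to both the input image and the predicted CAMs fits the same consistency pattern described in the abstract.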