SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation

Paper Title

SemFormer: Semantic Guided Activation Transformer for Weakly Supervised Semantic Segmentation

Authors

Junliang Chen, Xiaodong Zhao, Cheng Luo, Linlin Shen

Abstract

Recent mainstream weakly supervised semantic segmentation (WSSS) approaches are mainly based on Class Activation Maps (CAMs) generated by a CNN-based (Convolutional Neural Network) image classifier. In this paper, we propose a novel transformer-based framework for WSSS, named Semantic Guided Activation Transformer (SemFormer). We design a transformer-based Class-Aware AutoEncoder (CAAE) to extract the class embeddings of the input image and to learn class semantics for all classes of the dataset. The class embeddings and learned class semantics are then used to guide the generation of activation maps through four losses: class-foreground, class-background, activation suppression, and activation complementation. Experimental results show that SemFormer achieves 74.3% mIoU on the PASCAL VOC 2012 dataset and surpasses many recent mainstream WSSS approaches by a large margin. Code will be available at https://github.com/JLChen-C/SemFormer.
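
For readers unfamiliar with the metric reported in the abstract, mIoU (mean intersection-over-union) averages, over classes, the ratio of correctly labelled pixels to the union of predicted and ground-truth pixels for that class. The sketch below is a minimal pure-Python illustration and is not taken from the SemFormer code base. The function name `mean_iou`, the flat label lists, and the decision to skip classes that never occur are assumptions made for illustration. Label 255 is treated as the "void" label, following the PASCAL VOC annotation convention.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: flat sequences of integer class labels, one per pixel.
    Pixels whose ground-truth label equals ignore_index are skipped.
    Assumes predictions never contain ignore_index (illustrative simplification).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # void pixel: excluded from the metric
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that actually appear (union > 0).
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: four pixels, two classes, last pixel is void.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))
```

Benchmark evaluations typically accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores. The same function covers that case if the label lists of all images are concatenated first.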
