CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

Paper Title

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

Authors

Chen, Yifeng; Chu, Wenqing; Wang, Fangfang; Tai, Ying; Yi, Ran; Gan, Zhenye; Yao, Liang; Wang, Chengjie; Li, Xi

Abstract

Recently, there has been growing attention on one-stage panoptic segmentation methods, which aim to segment instances and stuff jointly and efficiently within a fully convolutional pipeline. However, most existing works feed the backbone features directly to the various segmentation heads, ignoring that semantic and instance segmentation have different demands: the former needs semantic-level discriminative features, while the latter requires features that are distinguishable across instances. To alleviate this, we propose to first predict semantic-level and instance-level correlations among different locations, which are used to enhance the backbone features, and then feed the improved discriminative features into the corresponding segmentation heads. Specifically, we organize the correlations between a given location and all locations as a continuous sequence and predict it as a whole. Considering that such a sequence can be extremely complicated, we adopt the Discrete Fourier Transform (DFT), a tool that can approximate an arbitrary sequence parameterized by amplitudes and phases. For different tasks, we generate these parameters from the backbone features in a fully convolutional way, which is optimized implicitly by the corresponding tasks. As a result, these accurate and consistent correlations contribute to producing plausible discriminative features that meet the requirements of the complicated panoptic segmentation task. To verify the effectiveness of our method, we conduct experiments on several challenging panoptic segmentation datasets and achieve state-of-the-art performance on MS COCO with 45.1% PQ and on ADE20K with 32.6% PQ.
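
The abstract's central idea, describing a correlation sequence by DFT amplitudes and phases, can be illustrated with a small sketch. This is not the authors' implementation: in CFNet the parameters are predicted from backbone features by convolutional layers, whereas here they are simply computed from a known toy sequence. The function names (dft, to_amp_phase, reconstruct) and the sample values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft(seq):
    """Naive O(N^2) Discrete Fourier Transform of a sequence."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(seq))
        for k in range(n)
    ]


def to_amp_phase(coeffs):
    """Split complex DFT coefficients into (amplitude, phase) pairs."""
    return [(abs(c), cmath.phase(c)) for c in coeffs]


def reconstruct(amp_phase, num_freqs=None):
    """Rebuild a real sequence from amplitudes and phases.

    With num_freqs=None every frequency is used and the original real
    sequence is recovered. Otherwise only frequencies whose index (or
    mirrored index) is below num_freqs are kept, which yields a smooth
    low-frequency approximation.
    """
    n = len(amp_phase)
    keep = n if num_freqs is None else num_freqs
    out = []
    for t in range(n):
        total = 0.0
        for k, (amp, phase) in enumerate(amp_phase):
            if min(k, n - k) < keep:
                total += amp * math.cos(2 * math.pi * k * t / n + phase)
        out.append(total / n)
    return out


# Toy "correlation sequence" between one location and eight locations
# (made-up values, not taken from the paper).
correlation = [0.1, 0.9, 0.8, 0.2, 0.0, 0.3, 0.7, 0.4]
params = to_amp_phase(dft(correlation))

exact = reconstruct(params)                # uses all amplitudes and phases
coarse = reconstruct(params, num_freqs=3)  # keeps only low frequencies
```

Keeping all frequencies reproduces the sequence up to floating-point error, while a smaller num_freqs trades accuracy for fewer parameters. That trade-off is presumably what makes a compact amplitude-and-phase parameterization attractive for predicting a whole sequence at once.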
