Paper Title

Context Prior for Scene Segmentation

Paper Authors

Changqian Yu, Jingbo Wang, Changxin Gao, Gang Yu, Chunhua Shen, Nong Sang

Paper Abstract

Recent works have widely explored the contextual dependencies to achieve more accurate segmentation results. However, most approaches rarely distinguish different types of contextual dependencies, which may pollute the scene understanding. In this work, we directly supervise the feature aggregation to distinguish the intra-class and inter-class context clearly. Specifically, we develop a Context Prior with the supervision of the Affinity Loss. Given an input image and corresponding ground truth, Affinity Loss constructs an ideal affinity map to supervise the learning of Context Prior. The learned Context Prior extracts the pixels belonging to the same category, while the reversed prior focuses on the pixels of different classes. Embedded into a conventional deep CNN, the proposed Context Prior Layer can selectively capture the intra-class and inter-class contextual dependencies, leading to robust feature representation. To validate the effectiveness, we design an effective Context Prior Network (CPNet). Extensive quantitative and qualitative evaluations demonstrate that the proposed model performs favorably against state-of-the-art semantic segmentation approaches. More specifically, our algorithm achieves 46.3% mIoU on ADE20K, 53.9% mIoU on PASCAL-Context, and 81.3% mIoU on Cityscapes. Code is available at https://git.io/ContextPrior.
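The abstract's core mechanism, supervising a predicted Context Prior map with an ideal affinity map derived from the ground truth, can be made concrete with a short sketch. The snippet below is a minimal PyTorch-style illustration and not the authors' released implementation: the function names (`ideal_affinity_map`, `affinity_loss`), the `ignore_index` handling, and the use of a plain per-entry binary cross-entropy are assumptions made for illustration, and any additional global terms in the paper's full Affinity Loss are omitted.

```python
import torch
import torch.nn.functional as F


def ideal_affinity_map(label, size, num_classes, ignore_index=255):
    """Build a binary "ideal affinity map" from a ground-truth label map.

    label: (B, H, W) tensor of integer class indices.
    size:  (h, w) spatial size of the feature map at which the prior is predicted.
    Returns a (B, h*w, h*w) tensor whose entry (i, j) is 1 when pixels i and j
    belong to the same category and 0 otherwise.
    """
    # Downsample labels to the prior's resolution; nearest neighbour keeps
    # the values integral (no interpolation between class indices).
    label = F.interpolate(label.unsqueeze(1).float(), size=size, mode="nearest")
    label = label.squeeze(1).long().view(label.size(0), -1)            # (B, N)

    # Map the ignore label to an extra bin so one-hot encoding stays valid,
    # then drop that bin so ignored pixels have no affinity with anything.
    label = torch.where(label == ignore_index,
                        torch.full_like(label, num_classes), label)
    one_hot = F.one_hot(label, num_classes + 1).float()[..., :num_classes]  # (B, N, C)

    # Two pixels share a category exactly when their one-hot rows overlap.
    return torch.bmm(one_hot, one_hot.transpose(1, 2))                 # (B, N, N)


def affinity_loss(prior_map, ideal_map):
    """Per-entry binary cross-entropy between the predicted prior map
    (sigmoid outputs in [0, 1]) and the ideal affinity map.
    This is only the unary part of the loss described in the paper."""
    return F.binary_cross_entropy(prior_map, ideal_map)
```

Given a learned prior map P of shape (B, N, N) and flattened features X of shape (B, N, C), the intra-class context can then be aggregated as torch.bmm(P, X) and the inter-class context as torch.bmm(1 - P, X), which mirrors the abstract's description of the prior and its reversed counterpart; how the two contexts are normalized and fused back into the backbone is left to the full paper.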
