Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration

Paper Title

Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration

Authors

Zongxin Yang, Yunchao Wei, Yi Yang

Abstract

This paper investigates the principles of embedding learning to tackle the challenging task of semi-supervised video object segmentation. Unlike previous practices that focus on the embedding learning of the foreground object(s), we argue that the background should be treated equally. We therefore propose Collaborative video object segmentation by Foreground-Background Integration (CFBI). CFBI separates the feature embedding into the foreground object region and its corresponding background region, implicitly encouraging the two to be more contrastive and improving the segmentation results accordingly. Moreover, CFBI performs both pixel-level matching and instance-level attention between the reference and the predicted sequence, making it robust to various object scales. Building on CFBI, we introduce a multi-scale matching structure and propose an Atrous Matching strategy, resulting in a more robust and efficient framework, CFBI+. We conduct extensive experiments on two popular benchmarks, DAVIS and YouTube-VOS. Without using any simulated data for pre-training, CFBI+ achieves a performance (J&F) of 82.9% and 82.8%, respectively, outperforming all other state-of-the-art methods. Code: https://github.com/z-x-yang/CFBI.
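The pixel-level matching idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that): the function names `min_sq_distance` and `match_foreground_background` are hypothetical, plain squared Euclidean distance is an assumed stand-in for whatever distance the paper uses, and the `stride` subsampling of reference pixels is only an assumed approximation of what the abstract calls Atrous Matching.

```python
import numpy as np


def min_sq_distance(query, reference, mask, stride=1):
    """Squared distance from each query pixel to its nearest masked reference pixel.

    query, reference: (H, W, C) pixel embeddings; mask: (H, W) boolean.
    stride > 1 keeps only every stride-th reference row and column
    (an assumed stand-in for atrous-style subsampling).
    """
    ref = reference[::stride, ::stride]
    sel = mask[::stride, ::stride]
    candidates = ref[sel]  # (N, C) reference embeddings inside the mask
    h, w, c = query.shape
    if candidates.shape[0] == 0:
        # No reference pixel of this class survived the subsampling.
        return np.full((h, w), np.inf)
    q = query.reshape(-1, c)
    # Pairwise squared distances, shape (H*W, N), then nearest neighbour.
    d2 = ((q[:, None, :] - candidates[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).reshape(h, w)


def match_foreground_background(query, reference, fg_mask, stride=1):
    """Return an (H, W, 2) map: distance to foreground and to background."""
    fg = min_sq_distance(query, reference, fg_mask, stride)
    bg = min_sq_distance(query, reference, ~fg_mask, stride)
    return np.stack([fg, bg], axis=-1)


# Toy usage: match a frame against itself with a square foreground mask.
rng = np.random.default_rng(0)
frame = rng.normal(size=(8, 8, 4))
fg_mask = np.zeros((8, 8), dtype=bool)
fg_mask[2:6, 2:6] = True
maps = match_foreground_background(frame, frame, fg_mask, stride=2)
```

Keeping the background distance map alongside the foreground one mirrors the abstract's point that both regions are matched; a larger `stride` shrinks the candidate set and so reduces the cost of the pairwise comparison, at the price of coarser matches.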
