Paper Title

PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection

Authors

Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang

Abstract

In this paper, we delve into two key techniques in Semi-Supervised Object Detection (SSOD), namely pseudo labeling and consistency training. We observe that these two techniques currently neglect some important properties of object detection, hindering efficient learning on unlabeled data. Specifically, for pseudo labeling, existing works only focus on the classification score yet fail to guarantee the localization precision of pseudo boxes; for consistency training, the widely adopted random-resize training only considers label-level consistency but misses feature-level consistency, which also plays an important role in ensuring scale invariance. To address the problems incurred by noisy pseudo boxes, we design Noisy Pseudo box Learning (NPL), which includes Prediction-guided Label Assignment (PLA) and Positive-proposal Consistency Voting (PCV). PLA relies on model predictions to assign labels, making the assignment robust even to coarse pseudo boxes, while PCV leverages the regression consistency of positive proposals to reflect the localization quality of pseudo boxes. Furthermore, in consistency training, we propose Multi-view Scale-invariant Learning (MSL), which includes both label- and feature-level consistency mechanisms, where feature consistency is achieved by aligning shifted feature pyramids between two images with identical content but varied scales. On the COCO benchmark, our method, termed PSEudo labeling and COnsistency training (PseCo), outperforms the SOTA (Soft Teacher) by 2.0, 1.8, and 2.0 points under 1%, 5%, and 10% labeling ratios, respectively. It also significantly improves the learning efficiency of SSOD; e.g., PseCo halves the training time of the SOTA approach while achieving even better performance. Code is available at https://github.com/ligang-cs/PseCo.
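
The abstract only sketches PCV at a high level. Below is a minimal, hypothetical PyTorch-style sketch of the idea: the regression consistency of the positive proposals assigned to a pseudo box, measured here as the mean IoU between their regressed boxes and that pseudo box, acts as a localization-quality score. All function and tensor names are illustrative assumptions, not the official PseCo implementation (see the linked repository for that).

```python
import torch

def box_iou(boxes1, boxes2):
    """Pairwise IoU between two sets of (x1, y1, x2, y2) boxes."""
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
    lt = torch.max(boxes1[:, None, :2], boxes2[None, :, :2])
    rb = torch.min(boxes1[:, None, 2:], boxes2[None, :, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, :, 0] * wh[:, :, 1]
    return inter / (area1[:, None] + area2[None, :] - inter)

def pcv_quality(regressed_boxes, assignment, pseudo_boxes):
    """Positive-proposal Consistency Voting (illustrative sketch).

    regressed_boxes: (P, 4) boxes regressed from the P positive proposals.
    assignment:      (P,) long tensor, index of the pseudo box each
                     positive proposal is assigned to.
    pseudo_boxes:    (G, 4) pseudo boxes produced by the teacher.
    Returns a (G,) score in [0, 1]: the mean IoU between each pseudo box
    and the boxes regressed from its own positives, i.e. how consistently
    the positives "vote" for the same location.
    """
    ious = box_iou(regressed_boxes, pseudo_boxes)                      # (P, G)
    iou_to_assigned = ious.gather(1, assignment[:, None]).squeeze(1)   # (P,)
    quality = pseudo_boxes.new_zeros(pseudo_boxes.size(0))
    counts = pseudo_boxes.new_zeros(pseudo_boxes.size(0))
    quality.scatter_add_(0, assignment, iou_to_assigned)
    counts.scatter_add_(0, assignment, torch.ones_like(iou_to_assigned))
    return quality / counts.clamp(min=1)
```

A score like this can then soft-weight the box-regression loss on unlabeled images, so pseudo boxes whose positive proposals disagree about localization contribute less gradient.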
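The feature-level consistency in MSL can be pictured similarly. The sketch below assumes an FPN callable that returns pyramid levels ordered fine to coarse (e.g., P2-P6); downsampling the input by 2x shifts the same content one level coarser, so level i of the 0.5x view should align with level i+1 of the 1x view. The `fpn` interface, the L2 loss, and detaching the full-resolution features are assumptions for illustration, not necessarily the paper's exact choices.

```python
import torch
import torch.nn.functional as F

def msl_feature_consistency(fpn, images):
    """Feature-level scale-consistency loss (illustrative sketch).

    fpn(images) is assumed to return a list of pyramid feature maps
    [P2, P3, ..., P6], ordered fine to coarse, for a batch of images.
    """
    feats_1x = fpn(images)                                   # full-resolution view
    images_05x = F.interpolate(images, scale_factor=0.5,
                               mode='bilinear', align_corners=False)
    feats_05x = fpn(images_05x)                              # 0.5x view, same content

    loss = images.new_zeros(())
    # Align shifted pairs: (P_i of the 0.5x view, P_{i+1} of the 1x view).
    for shifted, target in zip(feats_05x[:-1], feats_1x[1:]):
        # Resize in case rounding made the spatial sizes differ slightly.
        if shifted.shape[-2:] != target.shape[-2:]:
            shifted = F.interpolate(shifted, size=target.shape[-2:],
                                    mode='bilinear', align_corners=False)
        loss = loss + F.mse_loss(shifted, target.detach())
    return loss / (len(feats_1x) - 1)
```

This complements label-level consistency: instead of only asking the two views to produce the same boxes, it directly asks the backbone to produce scale-equivariant features across pyramid levels.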
