Self-supervised Robust Object Detectors from Partially Labelled Datasets

Paper title

Self-supervised Robust Object Detectors from Partially Labelled Datasets

Authors

Abbasi, Mahdieh; Laurendeau, Denis; Gagne, Christian

Abstract

In object detection, merging datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way, in terms of labor cost, to build a large-scale dataset covering a wide range of objects. Merging also allows us to train one integrated object detector instead of several, which in turn reduces computational and time costs. However, merging datasets from similar contexts produces partially labelled samples, because each constituent dataset was originally annotated only for its own set of OoI and leaves unannotated the objects that become of interest after merging. With the goal of training \emph{one integrated robust object detector with high generalization performance}, we propose a training framework that overcomes the missing-label challenge of merged datasets. More specifically, we propose a computationally efficient self-supervised framework that creates on-the-fly pseudo-labels for the unlabeled positive instances in the merged dataset, so that the object detector is trained jointly on both ground-truth labels and pseudo-labels. We evaluate the proposed framework by training Yolo on a simulated merged dataset with a missing rate of $\approx 48\%$, using VOC2012 and VOC2007. We empirically show that the generalization performance of Yolo trained on both the ground truth and the pseudo-labels created by our method is on average $4\%$ higher than that of Yolo trained only on the ground-truth labels of the merged dataset.
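To make the notion of a missing rate concrete, the following is a minimal sketch of how a merged dataset with partial labels could be simulated from fully annotated data. It is not the authors' protocol: the `Box` type, the helper names `keep_only_ooi` and `missing_rate`, the toy annotations, and the class split are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical annotation record: a class label plus a bounding box."""
    label: str
    xyxy: tuple  # (x1, y1, x2, y2)


def keep_only_ooi(annotations, ooi):
    """Drop every box whose class is outside this dataset's OoI set."""
    return [[box for box in image if box.label in ooi] for image in annotations]


def missing_rate(full, partial):
    """Fraction of ground-truth boxes that lost their label."""
    total = sum(len(image) for image in full)
    kept = sum(len(image) for image in partial)
    return (total - kept) / total if total else 0.0


# Toy, fully annotated datasets (one inner list per image).
dataset_a = [
    [Box("person", (0, 0, 10, 10)), Box("dog", (5, 5, 20, 20))],
    [Box("car", (1, 1, 30, 15))],
]
dataset_b = [
    [Box("dog", (2, 2, 12, 12)), Box("person", (8, 0, 16, 20))],
    [Box("car", (0, 0, 40, 20)), Box("person", (3, 3, 9, 18))],
]

# Assumed split: each dataset is annotated only for its own OoI.
ooi_a = {"person"}
ooi_b = {"dog", "car"}

merged_full = dataset_a + dataset_b
merged_partial = keep_only_ooi(dataset_a, ooi_a) + keep_only_ooi(dataset_b, ooi_b)

rate = missing_rate(merged_full, merged_partial)
print(f"missing rate: {rate:.3f}")
```

In this toy setup the merged dataset keeps all images but only the boxes each source dataset was annotated for; the unlabeled boxes are the positive instances for which a pseudo-labelling framework would need to recover labels.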
