Towards 3D Object Detection with 2D Supervision

Paper Title

Towards 3D Object Detection with 2D Supervision

Authors

Jinrong Yang, Tiancai Wang, Zheng Ge, Weixin Mao, Xiaoping Li, Xiangyu Zhang

Abstract

Progress in 3D object detection relies on large-scale data with 3D annotations. Annotating 3D bounding boxes is extremely expensive, whereas 2D annotations are easier and cheaper to collect. In this paper, we introduce a hybrid training framework that enables learning a visual 3D object detector from massive 2D (pseudo) labels, even without 3D annotations. To break through the information bottleneck of 2D cues, we explore a new perspective: Temporal 2D Supervision. We propose a temporal 2D transformation that bridges 3D predictions and temporal 2D labels. Two steps, homography warping and 2D box deduction, transform the 3D predictions into 2D ones for supervision. Experiments on the nuScenes dataset show strong results (nearly 90% of the fully supervised performance) with only 25% of the 3D annotations. We hope our findings provide new insights into using large numbers of 2D annotations for 3D perception.
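To make the idea of supervising a 3D prediction with a 2D label concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pinhole camera, a box described by center, size, and yaw in camera coordinates (x right, y down, z forward), and a pure ego translation between frames. The paper's homography warping step is replaced here by that simple rigid shift, and all function names and parameters are hypothetical.

```python
import math


def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z); size: (length along x, height along y, width along z);
    yaw: rotation about the vertical (y) axis, in radians.
    """
    cx, cy, cz = center
    length, height, width = size
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the offset about the y axis, then translate to the center.
                x = cx + dx * math.cos(yaw) + dz * math.sin(yaw)
                z = cz - dx * math.sin(yaw) + dz * math.cos(yaw)
                corners.append((x, cy + dy, z))
    return corners


def shift_to_frame(center, ego_translation):
    """Express a static object's center in the camera frame of another time step.

    Assumption: the ego vehicle only translates (no rotation) between frames.
    """
    return tuple(c - t for c, t in zip(center, ego_translation))


def deduce_2d_box(center, size, yaw, fx, fy, u0, v0):
    """Project the 3D box with a pinhole model and take the enclosing 2D box.

    Returns (u_min, v_min, u_max, v_max), or None if no corner is in front
    of the camera.
    """
    points = []
    for x, y, z in box_corners(center, size, yaw):
        if z > 0:  # keep only corners in front of the camera
            points.append((fx * x / z + u0, fy * y / z + v0))
    if not points:
        return None
    us = [p[0] for p in points]
    vs = [p[1] for p in points]
    return (min(us), min(vs), max(us), max(vs))


# Predicted 3D box in the current frame, and the same box one frame later
# after the ego vehicle has moved 1 m forward (hypothetical values).
box_now = deduce_2d_box((0.0, 0.0, 10.0), (2.0, 2.0, 2.0), 0.0, 100.0, 100.0, 0.0, 0.0)
center_next = shift_to_frame((0.0, 0.0, 10.0), (0.0, 0.0, 1.0))
box_next = deduce_2d_box(center_next, (2.0, 2.0, 2.0), 0.0, 100.0, 100.0, 0.0, 0.0)
```

Each deduced 2D box could then be compared with the 2D (pseudo) label of its own frame, for example with an L1 or IoU-based loss, so that several frames constrain the same 3D prediction.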
