Paper Title
Recursive Cross-View: Use Only 2D Detectors to Achieve 3D Object Detection without 3D Annotations
Paper Authors
Paper Abstract
Heavily relying on 3D annotations limits the real-world application of 3D object detection. In this paper, we propose a method that does not demand any 3D annotation while still predicting fully oriented 3D bounding boxes. Our method, called Recursive Cross-View (RCV), utilizes the three-view principle to convert 3D detection into multiple 2D detection tasks, requiring only a subset of 2D labels. We propose a recursive paradigm in which instance segmentation and 3D bounding box generation via Cross-View are performed recursively until convergence. Specifically, our method first lifts each 2D bounding box into a frustum and then applies the recursive paradigm, ultimately producing a fully oriented 3D box together with its corresponding class and score; the class and score are given by the 2D detector. Evaluated on the SUN RGB-D and KITTI datasets, our method outperforms existing image-based approaches. To demonstrate that our method can be quickly adapted to new tasks, we apply it to two real-world scenarios, namely 3D human detection and 3D hand detection. As a result, two new 3D annotated datasets are obtained, which means that RCV can also serve as a (semi-)automatic 3D annotator. Furthermore, we deploy RCV on a depth sensor, where it achieves detection at 7 fps on a live RGB-D stream. RCV is the first 3D detection method that yields fully oriented 3D boxes without consuming 3D labels.
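The abstract describes the pipeline only at a high level, so the following is a minimal illustrative sketch of the recursive Cross-View loop, not the authors' implementation. It assumes orthographic projections onto the three principal views, point-in-box intersection as the instance-segmentation step, and an axis-aligned output box (the paper produces fully oriented boxes); the detect_2d callback, the convergence test, and all function names are hypothetical.

# Sketch of the recursive Cross-View loop described in the abstract (assumptions noted above).
import numpy as np

def project_to_view(points, drop_axis):
    # Orthographically project (N, 3) points onto the plane orthogonal to drop_axis.
    keep = [a for a in range(3) if a != drop_axis]
    return points[:, keep]

def points_in_2d_box(view_pts, box):
    # Boolean mask of projected points inside an axis-aligned 2D box (xmin, ymin, xmax, ymax).
    xmin, ymin, xmax, ymax = box
    return ((view_pts[:, 0] >= xmin) & (view_pts[:, 0] <= xmax) &
            (view_pts[:, 1] >= ymin) & (view_pts[:, 1] <= ymax))

def recursive_cross_view(frustum_pts, detect_2d, max_iters=10, tol=1e-3):
    # frustum_pts : (N, 3) points lifted from one image-space 2D detection.
    # detect_2d   : placeholder callable(view_pts) -> (xmin, ymin, xmax, ymax) or None,
    #               standing in for a 2D detector run on a rendered cross-view.
    # Returns an axis-aligned 3D box (min_xyz, max_xyz) once the cross-view
    # segmentation converges, or None if the detector finds nothing.
    pts = frustum_pts
    prev_box = None
    for _ in range(max_iters):
        # Segment the instance by intersecting 2D detections on orthogonal views.
        mask = np.ones(len(pts), dtype=bool)
        for drop_axis in (0, 1, 2):  # front, side, and top views
            view = project_to_view(pts, drop_axis)
            box2d = detect_2d(view)
            if box2d is None:
                return None
            mask &= points_in_2d_box(view, box2d)
        pts = pts[mask]
        if len(pts) == 0:
            return None
        box = (pts.min(axis=0), pts.max(axis=0))
        # Stop once the 3D box stops changing between iterations.
        if prev_box is not None and np.allclose(np.concatenate(box),
                                                np.concatenate(prev_box), atol=tol):
            break
        prev_box = box
    return box

As a toy check, detect_2d can be a stub that returns the tight 2D bounds of the projected points, e.g. detect_2d = lambda v: (v[:, 0].min(), v[:, 1].min(), v[:, 0].max(), v[:, 1].max()), which converges in one pass; in practice the views would be rendered images fed to a trained 2D detector, and the box estimate would include orientation.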