Paper Title

Finding Your (3D) Center: 3D Object Detection Using a Learned Loss

Paper Authors

Griffiths, David, Boehm, Jan, Ritschel, Tobias

Paper Abstract

Massive semantically labeled datasets are readily available for 2D images but are much harder to obtain for 3D scenes. Objects in 3D repositories like ShapeNet are labeled, but regrettably only in isolation, so without context. 3D scenes can be acquired by range scanners at city-level scale, but far fewer come with semantic labels. Addressing this disparity, we introduce a new optimization procedure that allows training for 3D detection on raw 3D scans while using as little as 5% of the object labels, and still achieves comparable performance. Our optimization uses two networks. A scene network maps an entire 3D scene to a set of 3D object centers. As we assume the scene is not labeled with centers, no classic loss, such as the Chamfer distance, can be used to train it. Instead, we use another network to emulate the loss. This loss network is trained on a small labeled subset and maps a non-centered 3D object, in the presence of distractions, to its own center. This function is very similar to - and hence can be used instead of - the gradient the supervised loss would provide. Our evaluation documents competitive fidelity at a much lower level of supervision, and, respectively, higher quality at comparable supervision. Supplementary material can be found at: https://dgriffiths3.github.io.
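The core idea in the abstract can be sketched in toy form: a loss network predicts, for a query position in a raw scan, an offset toward the nearest object center, and this predicted offset stands in for the gradient a supervised loss would provide to the scene network. The following minimal NumPy sketch is illustrative only; the "loss network" here is mocked by a local-centroid estimate, a stand-in for the trained regressor described in the paper, and `make_scene`, `loss_network`, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_scene(center, n=200, noise=0.05):
    """Toy point cloud for a single object scattered around `center`."""
    return center + rng.normal(scale=noise, size=(n, 3))

def loss_network(points, query, radius=0.5):
    """Stand-in for the learned loss: returns the offset from `query`
    toward the centroid of nearby points. The paper's actual loss
    network would predict such an offset from a raw, distractor-laden
    crop after training on a small labeled subset."""
    local = points[np.linalg.norm(points - query, axis=1) < radius]
    if len(local) == 0:
        return np.zeros(3)
    return local.mean(axis=0) - query

true_center = np.array([1.0, 2.0, 0.5])
scene = make_scene(true_center)

# An imperfect predicted center (playing the role of the scene
# network's output) is refined by following the loss network's
# pseudo-gradient instead of a supervised Chamfer-style gradient.
pred = true_center + np.array([0.3, -0.2, 0.1])
for _ in range(10):
    pred = pred + 0.5 * loss_network(scene, pred)
```

After a few steps the prediction settles near the object's centroid, illustrating how an offset-predicting network can substitute for the gradient of a center-supervised loss.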
