Paper Title


CVFNet: Real-time 3D Object Detection by Learning Cross View Features

Authors

Jiaqi Gu, Zhiyu Xiang, Pan Zhao, Tingming Bai, Lingxuan Wang, Xijun Zhao, Zhiyuan Zhang

Abstract


In recent years, 3D object detection from LiDAR point clouds has made great progress thanks to the development of deep learning technologies. Although voxel-based or point-based methods are popular in 3D object detection, they usually involve time-consuming operations such as 3D convolutions on voxels or ball queries among points, making the resulting network inappropriate for time-critical applications. On the other hand, 2D view-based methods feature high computational efficiency while usually obtaining inferior performance to voxel-based or point-based methods. In this work, we present a real-time view-based single-stage 3D object detector, namely CVFNet, to fulfill this task. To strengthen cross-view feature learning under demanding efficiency constraints, our framework extracts features from different views and fuses them in an efficient progressive way. We first propose a novel Point-Range feature fusion module that deeply integrates point and range view features over multiple stages. Then, a special Slice Pillar is designed to preserve 3D geometry when transforming the resulting deep point-view features into bird's eye view. To better balance the ratio of samples, a sparse pillar detection head is presented to focus detection on the non-empty grids. We conduct experiments on the popular KITTI and NuScenes benchmarks and achieve state-of-the-art performance in terms of both accuracy and speed.
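The abstract only sketches the architecture, but the two core cross-view operations it names are easy to illustrate: gathering range-view features back to their source points for point-range fusion, and scattering the fused point features into a bird's-eye-view pillar grid. The following is a minimal PyTorch sketch of these two steps under stated assumptions; all class and function names, tensor shapes, and the max-pooling reduction are illustrative choices, not the authors' implementation (in particular, the paper's Slice Pillar additionally slices along height, which this sketch omits).

```python
import torch
import torch.nn as nn


class PointRangeFusion(nn.Module):
    """One hypothetical point-range fusion stage: range-view (2D) features
    are gathered back to their source points and merged with per-point
    features, so both views contribute to the refined point features."""

    def __init__(self, point_dim, range_dim, out_dim):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(point_dim + range_dim, out_dim), nn.ReLU())

    def forward(self, point_feats, range_feats, pixel_uv):
        # point_feats: (N, point_dim) per-point features
        # range_feats: (range_dim, H, W) range-image feature map
        # pixel_uv:    (N, 2) integer (u, v) pixel of each point's range cell
        c, h, w = range_feats.shape
        flat = range_feats.view(c, h * w)                          # (C, H*W)
        idx = pixel_uv[:, 1].long() * w + pixel_uv[:, 0].long()    # (N,)
        gathered = flat[:, idx].t()                                # (N, range_dim)
        fused = torch.cat([point_feats, gathered], dim=1)
        return self.point_mlp(fused)                               # (N, out_dim)


def scatter_points_to_bev(point_feats, xy, grid_size, bev_range):
    """Hypothetical BEV projection: max-pool point features into pillars
    on a regular x-y grid (a simplified stand-in for Slice Pillar)."""
    (x_min, x_max), (y_min, y_max) = bev_range
    gx = ((xy[:, 0] - x_min) / (x_max - x_min) * grid_size[0]).long()
    gy = ((xy[:, 1] - y_min) / (y_max - y_min) * grid_size[1]).long()
    gx = gx.clamp(0, grid_size[0] - 1)
    gy = gy.clamp(0, grid_size[1] - 1)
    bev = point_feats.new_zeros(grid_size[0] * grid_size[1],
                                point_feats.shape[1])
    idx = (gx * grid_size[1] + gy).unsqueeze(1).expand_as(point_feats)
    # Max-reduce point features into their pillar; empty pillars stay zero.
    bev.scatter_reduce_(0, idx, point_feats, reduce="amax",
                        include_self=False)
    return bev.view(grid_size[0], grid_size[1], -1)                # (X, Y, C)


if __name__ == "__main__":
    pts = torch.randn(1024, 16)                      # toy per-point features
    rng = torch.randn(32, 64, 512)                   # toy range-image features
    uv = torch.stack([torch.randint(0, 512, (1024,)),
                      torch.randint(0, 64, (1024,))], dim=1)
    deep = PointRangeFusion(16, 32, 64)(pts, rng, uv)            # (1024, 64)
    xy = torch.rand(1024, 2) * 100 - 50              # toy x-y coordinates
    bev = scatter_points_to_bev(deep, xy, (200, 200),
                                ((-50, 50), (-50, 50)))          # (200, 200, 64)
    print(deep.shape, bev.shape)
```

The gather/scatter formulation is chosen here because it avoids explicit loops over voxels or ball queries, which matches the efficiency argument of the abstract; only the zero-initialized non-empty pillars carry signal, which is also why a sparse detection head over non-empty grids makes sense.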
