Paper Title
Monocular 3D Object Detection with Sequential Feature Association and Depth Hint Augmentation
Paper Authors
Abstract
Monocular 3D object detection, which aims to predict the geometric properties of on-road objects, is a promising research topic for the intelligent perception systems of autonomous driving. Most state-of-the-art methods follow a keypoint-based paradigm, in which the keypoints of objects are predicted and used as the basis for regressing the other geometric properties. In this work, a unified network named FADNet is presented to address the task of monocular 3D object detection. In contrast to previous keypoint-based methods, we propose to divide the output modalities into groups according to the estimation difficulty of the object properties. The groups are treated differently and are sequentially associated by a convolutional Gated Recurrent Unit. Another contribution of this work is the strategy of depth hint augmentation. To provide characterized depth patterns as hints for depth estimation, a dedicated depth hint module is designed to generate row-wise features called depth hints, which are explicitly supervised in a bin-wise manner. The contributions of this work are validated through experiments and ablation studies on the KITTI benchmark. Without using depth priors, post-optimization, or other refinement modules, our network performs competitively against state-of-the-art methods while maintaining a decent running speed.
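To make the idea of bin-wise supervision more concrete, the following is a minimal sketch of how a continuous depth value could be turned into a discrete bin label and back. The abstract does not specify the binning scheme, so the uniform bins, the depth range of 0 to 80 m, the bin count, and the function names `depth_to_bin` and `bin_to_depth` are all assumptions made for illustration, not the authors' implementation.

```python
def depth_to_bin(depth, d_min=0.0, d_max=80.0, num_bins=80):
    """Map a metric depth to a bin index in [0, num_bins - 1].

    Uniform bin widths and the default range are illustrative assumptions;
    the abstract only states that depth hints are supervised bin-wise.
    """
    if num_bins <= 0 or d_max <= d_min:
        raise ValueError("need num_bins > 0 and d_max > d_min")
    bin_width = (d_max - d_min) / num_bins
    idx = int((depth - d_min) // bin_width)
    # Clamp out-of-range depths to the first or last bin.
    return min(max(idx, 0), num_bins - 1)


def bin_to_depth(idx, d_min=0.0, d_max=80.0, num_bins=80):
    """Return the center depth of a bin (inverse of depth_to_bin up to bin width)."""
    if num_bins <= 0 or d_max <= d_min:
        raise ValueError("need num_bins > 0 and d_max > d_min")
    bin_width = (d_max - d_min) / num_bins
    return d_min + (idx + 0.5) * bin_width
```

With labels of this kind, a classification loss such as cross-entropy over the bins could serve as the explicit supervision signal; whether FADNet uses uniform or non-uniform bins is not stated in the abstract.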