LiDAR Data Enrichment Using Deep Learning Based on High-Resolution Image: An Approach to Achieve High-Performance LiDAR SLAM Using Low-cost LiDAR

Paper Title

LiDAR Data Enrichment Using Deep Learning Based on High-Resolution Image: An Approach to Achieve High-Performance LiDAR SLAM Using Low-cost LiDAR

Authors

Jiang Yue, Weisong Wen, Jing Han, Li-Ta Hsu

Abstract

LiDAR-based SLAM algorithms have been studied extensively over the past decades to provide robust and accurate positioning for autonomous driving vehicles (ADV). Satisfactory performance can be obtained with a high-grade 64-channel 3D LiDAR, which provides dense point clouds. Unfortunately, its high price significantly hinders widespread commercialization in ADV. A cost-effective 16-channel 3D LiDAR is a promising replacement. However, a 16-channel LiDAR provides only limited, sparse point clouds, which cannot guarantee sufficient positioning accuracy for ADV in challenging dynamic environments. A high-resolution image from a low-cost camera provides ample information about the surroundings, but it carries no explicit depth information. Inspired by the complementarity of 3D LiDAR and camera, this paper proposes using high-resolution camera images to enrich the raw 3D point clouds from a low-cost 16-channel LiDAR, based on a state-of-the-art deep learning algorithm. First, an ERFNet is employed to segment the image with the aid of the raw sparse 3D point clouds. Meanwhile, a sparse convolutional neural network is employed to predict dense point clouds from the raw sparse 3D point clouds. Then, the predicted dense point clouds are fused with the segmentation outputs from ERFNet using a novel multi-layer convolutional neural network to refine the predicted 3D point clouds. Finally, the enriched point clouds are used to perform LiDAR SLAM based on the state-of-the-art normal distribution transform (NDT). We tested our approach on re-edited KITTI datasets: (1) the sparse 3D point clouds are significantly enriched, with a mean square error (MSE) of 1.1 m; (2) the map generated by the LiDAR SLAM is denser and includes more details, without significant loss of accuracy.
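
The abstract reports enrichment quality as a mean square error but does not state how that figure is computed. As a minimal sketch, assuming the common depth-completion convention of comparing a predicted dense depth map against a reference depth map only at pixels where the reference has a measurement, the error could be evaluated as below. The function name `masked_depth_errors` and the use of 0 to mark pixels without a LiDAR return are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence, Tuple


def masked_depth_errors(
    pred: Sequence[Sequence[float]],
    ref: Sequence[Sequence[float]],
    invalid: float = 0.0,
) -> Tuple[float, float]:
    """Return (MSE, RMSE) over pixels where the reference depth is valid.

    Assumption: a reference value <= `invalid` means "no LiDAR return",
    so that pixel is skipped instead of being counted as an error.
    """
    squared_sum = 0.0
    count = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if r > invalid:
                squared_sum += (p - r) ** 2
                count += 1
    if count == 0:
        raise ValueError("reference depth map has no valid pixels")
    mse = squared_sum / count
    return mse, mse ** 0.5


# Toy 2x2 depth maps in metres; the top-right reference pixel has no return.
predicted = [[10.0, 5.0], [3.0, 8.0]]
reference = [[12.0, 0.0], [3.0, 7.0]]
mse, rmse = masked_depth_errors(predicted, reference)
print(f"MSE = {mse:.3f}, RMSE = {rmse:.3f}")
```

Masking matters here because a reference map built from LiDAR returns is itself incomplete; counting empty pixels as zero-depth targets would inflate the error.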
