Paper Title
CASSPR: Cross Attention Single Scan Place Recognition
Paper Authors
Abstract
Place recognition based on point clouds (LiDAR) is an important component for autonomous robots or self-driving vehicles. Current SOTA performance is achieved on accumulated LiDAR submaps using either point-based or voxel-based structures. While voxel-based approaches nicely integrate spatial context across multiple scales, they do not exhibit the local precision of point-based methods. As a result, existing methods struggle with fine-grained matching of subtle geometric features in sparse single-shot LiDAR scans. To overcome these limitations, we propose CASSPR as a method to fuse point-based and voxel-based approaches using cross attention transformers. CASSPR leverages a sparse voxel branch for extracting and aggregating information at lower resolution and a point-wise branch for obtaining fine-grained local information. CASSPR uses queries from one branch to try to match structures in the other branch, ensuring that both extract self-contained descriptors of the point cloud (rather than one branch dominating), but using both to inform the output global descriptor of the point cloud. Extensive experiments show that CASSPR surpasses the state-of-the-art by a large margin on several datasets (Oxford RobotCar, TUM, USyd). For instance, it achieves AR@1 of 85.6% on the TUM dataset, surpassing the strongest prior model by ~15%. Our code is publicly available.
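The core fusion idea in the abstract — queries from one branch attending to keys/values from the other, so each branch's features inform the global descriptor — can be illustrated with a minimal NumPy sketch. This is an illustrative simplification, not the paper's implementation: the shapes, the pooling step, and the absence of learned query/key/value projections and sparse voxel convolutions are all assumptions made here for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d):
    # queries: (Nq, d) features from one branch;
    # keys_values: (Nk, d) features from the other branch.
    # Scaled dot-product attention lets each query aggregate
    # the other branch's features (simplified: K = V, no projections).
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ keys_values

# Hypothetical feature sets: 8 point-wise features and 4 voxel
# features, both with dimension 16.
rng = np.random.default_rng(0)
point_feats = rng.standard_normal((8, 16))
voxel_feats = rng.standard_normal((4, 16))

# Each branch queries the other, so neither dominates and both
# contribute self-contained information.
point_attended = cross_attention(point_feats, voxel_feats, 16)  # (8, 16)
voxel_attended = cross_attention(voxel_feats, point_feats, 16)  # (4, 16)

# Pool and concatenate into a single global descriptor (simplified).
global_desc = np.concatenate([point_attended.max(0), voxel_attended.max(0)])
print(global_desc.shape)  # (32,)
```

In the actual architecture, the attention weights and projections are learned end-to-end; the sketch only shows the data flow between the two branches.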