Learning to Localize in New Environments from Synthetic Training Data

Paper Title

Learning to Localize in New Environments from Synthetic Training Data

Authors

Dominik Winkelbauer, Maximilian Denninger, Rudolph Triebel

Abstract

Most existing approaches for visual localization either need a detailed 3D model of the environment or, in the case of learning-based methods, must be retrained for each new scene. This can either be very expensive or simply impossible for large, unknown environments, for example in search-and-rescue scenarios. Although there are learning-based approaches that operate scene-agnostically, the generalization capability of these methods is still outperformed by classical approaches. In this paper, we present an approach that can generalize to new scenes by applying specific changes to the model architecture, including an extended regression part, the use of hierarchical correlation layers, and the exploitation of scale and uncertainty information. Our approach outperforms the 5-point algorithm using SIFT features on equally big images and additionally surpasses all previous learning-based approaches that were trained on different data. It is also superior to most of the approaches that were specifically trained on the respective scenes. We also evaluate our approach in a scenario where only very few reference images are available, showing that under such more realistic conditions our learning-based approach considerably exceeds both existing learning-based and classical methods.
