Paper Title

Learning Condition Invariant Features for Retrieval-Based Localization from 1M Images

Authors

Thoma, Janine, Paudel, Danda Pani, Chhatkuli, Ajad, Van Gool, Luc

Abstract

Image features for retrieval-based localization must be invariant to dynamic objects (e.g. cars) as well as seasonal and daytime changes. Such invariances are, up to some extent, learnable with existing methods using triplet-like losses, given a large number of diverse training images. However, due to the high algorithmic training complexity, there exists insufficient comparison between different loss functions on large datasets. In this paper, we train and evaluate several localization methods on three different benchmark datasets, including Oxford RobotCar with over one million images. This large scale evaluation yields valuable insights into the generalizability and performance of retrieval-based localization. Based on our findings, we develop a novel method for learning more accurate and better generalizing localization features. It consists of two main contributions: (i) a feature volume-based loss function, and (ii) hard positive and pairwise negative mining. On the challenging Oxford RobotCar night condition, our method outperforms the well-known triplet loss by 24.4% in localization accuracy within 5m.
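The triplet-like loss that the abstract uses as a baseline, together with hard positive and hard negative mining, can be sketched as follows. This is a generic NumPy illustration of the standard formulation, not the paper's implementation; the function names and the particular mining rule shown are assumptions for illustration only.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss on L2 distances: push the anchor-positive
    distance below the anchor-negative distance by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def mine_hard_triplet(features, anchor_idx, positive_idxs, negative_idxs):
    """Hard mining (illustrative): pick the positive FARTHEST from the
    anchor and the negative CLOSEST to it -- the most informative pair,
    since it produces the largest loss for the current features."""
    a = features[anchor_idx]
    hard_pos = max(positive_idxs, key=lambda i: np.linalg.norm(a - features[i]))
    hard_neg = min(negative_idxs, key=lambda i: np.linalg.norm(a - features[i]))
    return hard_pos, hard_neg

# Toy example: 2-D "features" for four images.
features = np.array([[0.0, 0.0],   # anchor
                     [1.0, 0.0],   # easy positive
                     [3.0, 0.0],   # hard positive (farther away)
                     [0.5, 0.0]])  # negative, very close to the anchor
p, n = mine_hard_triplet(features, 0, positive_idxs=[1, 2], negative_idxs=[3])
loss = triplet_loss(features[0], features[p], features[n])
```

With easy triplets the hinge in `triplet_loss` is often already zero and provides no gradient, which is why mining hard examples matters at the million-image scale the paper studies.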
