Why-So-Deep: Towards Boosting Previously Trained Models for Visual Place Recognition

Paper Title

Why-So-Deep: Towards Boosting Previously Trained Models for Visual Place Recognition

Authors

Bhutta, M. Usman Maqbool; Sun, Yuxiang; Lau, Darwin; Liu, Ming

Abstract

Deep learning-based image retrieval techniques for loop closure detection demonstrate satisfactory performance. However, it is still challenging to achieve high-level performance based on previously trained models in different geographical regions. This paper addresses the problem of deploying such models with simultaneous localization and mapping (SLAM) systems in new environments. The general baseline approach uses additional information, such as GPS, sequential keyframe tracking, and re-training on the whole environment, to enhance the recall rate. We propose a novel approach for improving image retrieval based on previously trained models. We present an intelligent method, MAQBOOL, to amplify the power of pre-trained models for better image recall and its application to real-time multi-agent SLAM systems. We achieve comparable image retrieval results at a low descriptor dimension (512-D), compared to the high descriptor dimension (4096-D) of state-of-the-art methods. We use spatial information to improve the recall rate in image retrieval on pre-trained models.
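
For readers unfamiliar with the recall rate mentioned in the abstract, the following is a minimal sketch of how recall@N is commonly computed for descriptor-based image retrieval. It is not the MAQBOOL method. The function names (`l2_normalize`, `top_n`, `recall_at_n`), the use of cosine similarity, and the 3-D toy descriptors are all assumptions made for illustration; real systems would use descriptors such as the 512-D or 4096-D vectors the abstract refers to.

```python
import math

def l2_normalize(v):
    """Scale a descriptor to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def top_n(query, database, n):
    """Indices of the n database descriptors most similar to the query (cosine similarity)."""
    q = l2_normalize(query)
    scores = []
    for idx, d in enumerate(database):
        dn = l2_normalize(d)
        scores.append((sum(a * b for a, b in zip(q, dn)), idx))
    # Highest similarity first; ties broken by lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scores[:n]]

def recall_at_n(queries, database, ground_truth, n):
    """Fraction of queries with at least one true match among the top n results."""
    hits = 0
    for q, positives in zip(queries, ground_truth):
        if any(i in positives for i in top_n(q, database, n)):
            hits += 1
    return hits / len(queries)

# Toy data (hypothetical): four database descriptors, two queries.
database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.7, 0.7, 0.0]]
queries = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9]]
ground_truth = [{3}, {2}]  # indices of the true matching places for each query

print(recall_at_n(queries, database, ground_truth, 1))
print(recall_at_n(queries, database, ground_truth, 2))
```

In this toy setup the first query's true match ranks second, so it counts as a hit only once N reaches 2. This is why results in this area are usually reported as a recall curve over several values of N.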
