PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

Paper Title

PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

Authors

Santhosh Kumar Ramakrishnan, Devendra Singh Chaplot, Ziad Al-Halah, Jitendra Malik, Kristen Grauman

Abstract

State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of "where to look?" for an object and "how to navigate to (x, y)?". Our key insight is that "where to look?" can be treated purely as a perception problem, and learned without environment interactions. To address this, we propose a network that predicts two complementary potential functions conditioned on a semantic map and uses them to decide where to look for an unseen object. We train the potential function network using supervised learning on a passive dataset of top-down semantic maps, and integrate it into a modular framework to perform ObjectGoal navigation. Experiments on Gibson and Matterport3D demonstrate that our method achieves the state-of-the-art for ObjectGoal navigation while incurring up to 1,600x less computational cost for training. Code and pre-trained models are available: https://vision.cs.utexas.edu/projects/poni/
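The abstract says the network predicts two complementary potential functions over a semantic map and uses them to decide where to look. A minimal sketch of that decision step follows, assuming the two predictions arrive as 2D grids of scores. The function name `select_goal`, the weighted-sum combination, the weight `alpha`, and the `candidate_mask` of allowed cells are all hypothetical choices made for illustration; the abstract does not specify how the two potentials are combined or which cells are eligible.

```python
from typing import Sequence, Tuple


def select_goal(
    potential_a: Sequence[Sequence[float]],
    potential_b: Sequence[Sequence[float]],
    candidate_mask: Sequence[Sequence[bool]],
    alpha: float = 0.5,
) -> Tuple[int, int]:
    """Pick the candidate map cell with the highest combined potential.

    Assumption: the two potentials are mixed with a weighted sum,
    alpha * a + (1 - alpha) * b. This rule is illustrative only.
    """
    best_cell = None
    best_score = float("-inf")
    rows = zip(potential_a, potential_b, candidate_mask)
    for r, (row_a, row_b, row_mask) in enumerate(rows):
        for c, (pa, pb, is_candidate) in enumerate(zip(row_a, row_b, row_mask)):
            if not is_candidate:
                continue  # skip cells that are not allowed as goals
            score = alpha * pa + (1.0 - alpha) * pb
            if score > best_score:
                best_score = score
                best_cell = (r, c)
    if best_cell is None:
        raise ValueError("candidate_mask selects no cells")
    return best_cell


# Toy 2x2 example: the top-right cell scores highest but is masked out.
pot_a = [[0.1, 0.9], [0.2, 0.3]]
pot_b = [[0.0, 0.8], [0.9, 0.1]]
mask = [[True, False], [True, True]]
goal = select_goal(pot_a, pot_b, mask)
```

The selected cell would then be handed to whatever component answers "how to navigate to (x, y)?", which is the second skill the abstract separates out.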
