Paper Title

Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation

Paper Authors

Peihao Chen, Dongyu Ji, Kunyang Lin, Runhao Zeng, Thomas H. Li, Mingkui Tan, Chuang Gan

Paper Abstract

We address a practical yet challenging problem of training robot agents to navigate in an environment following a path described by some language instructions. The instructions often contain descriptions of objects in the environment. To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both the spatial location and the semantic information of the environment objects. However, enabling a robot to build a map that well represents the environment is extremely challenging as the environment often involves diverse objects with various attributes. In this paper, we propose a multi-granularity map, which contains both object fine-grained details (e.g., color, texture) and semantic classes, to represent objects more comprehensively. Moreover, we propose a weakly-supervised auxiliary task, which requires the agent to localize instruction-relevant objects on the map. Through this task, the agent not only learns to localize the instruction-relevant objects for navigation but also is encouraged to learn a better map representation that reveals object information. We then feed the learned map and instruction to a waypoint predictor to determine the next navigation goal. Experimental results show our method outperforms the state-of-the-art by 4.0% and 4.6% w.r.t. success rate in seen and unseen environments, respectively, on the VLN-CE dataset. Code is available at https://github.com/PeihaoChen/WS-MGMap.
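To make the pipeline described in the abstract concrete, below is a minimal PyTorch-style sketch of the three described components: a multi-granularity map fused from fine-grained feature channels and semantic-class channels, a weakly-supervised head that localizes instruction-relevant objects on the map, and a waypoint predictor that scores map cells as the next navigation goal. All module names, tensor shapes, and hyperparameters here are illustrative assumptions, not the authors' implementation; the official code is at the linked repository.

```python
# Minimal sketch of the multi-granularity map idea, assuming egocentric map
# tensors and a pooled instruction embedding. Shapes and layer sizes are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class MultiGranularityMapNav(nn.Module):
    def __init__(self, fine_dim=64, num_classes=40, instr_dim=256):
        super().__init__()
        # Fuse fine-grained feature channels with semantic-class channels.
        self.map_fusion = nn.Conv2d(fine_dim + num_classes, 128,
                                    kernel_size=3, padding=1)
        # Weakly-supervised auxiliary head: heatmap over map cells for
        # objects mentioned in the instruction.
        self.obj_localizer = nn.Conv2d(128 + instr_dim, 1, kernel_size=1)
        # Waypoint predictor: score each map cell as the next navigation goal.
        self.waypoint_head = nn.Conv2d(128 + instr_dim, 1, kernel_size=1)

    def forward(self, fine_map, sem_map, instr_feat):
        # fine_map: (B, fine_dim, H, W); sem_map: (B, num_classes, H, W)
        # instr_feat: (B, instr_dim) pooled instruction embedding
        fused = torch.relu(self.map_fusion(torch.cat([fine_map, sem_map], dim=1)))
        B, _, H, W = fused.shape
        instr = instr_feat[:, :, None, None].expand(B, -1, H, W)
        joint = torch.cat([fused, instr], dim=1)
        obj_heatmap = self.obj_localizer(joint)      # auxiliary localization logits
        waypoint_logits = self.waypoint_head(joint)  # next-goal logits over map cells
        return obj_heatmap, waypoint_logits

# Usage with random tensors to illustrate the expected shapes.
model = MultiGranularityMapNav()
fine = torch.randn(2, 64, 96, 96)
sem = torch.randn(2, 40, 96, 96)
instr = torch.randn(2, 256)
heatmap, waypoints = model(fine, sem, instr)   # both (2, 1, 96, 96)
```

In this sketch the auxiliary localization head and the waypoint head share the fused map-plus-instruction features, which mirrors the abstract's point that the localization task encourages a map representation that also benefits waypoint prediction.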
