Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments

Paper title

Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments

Authors

Yafei Hu, Junyi Geng, Chen Wang, John Keller, Sebastian Scherer

Abstract

Autonomous exploration has many important applications. However, classic information-gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal. It therefore cannot predict the value of future states, which leads to inefficient exploration decisions. This paper presents a method to learn how "good" states are, as measured by the state value function, to guide robot exploration in challenging real-world environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). The method consists of offline Monte Carlo training on real-world data followed by online Temporal Difference (TD) adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage so that the robot can gain more information when extrinsic rewards are sparse. Results show that our method enables the robot to predict the value of future states and thus better guides robot exploration. The proposed algorithm achieves better prediction and exploration performance than state-of-the-art methods. To the best of our knowledge, this work is the first to demonstrate value function prediction on a real-world dataset for robot exploration in challenging subterranean and urban environments. More details and demo videos can be found at https://jeffreyyh.github.io/opere/.
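
The two training stages named in the abstract, offline Monte Carlo fitting followed by online TD adaptation, can be illustrated with a minimal sketch. This is not the authors' implementation: the linear value estimator, the feature vectors `phi`, the function names, and all hyperparameters below are assumptions chosen only to keep the example small.

```python
import numpy as np


def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def fit_offline(features, returns, lr=0.1, epochs=200):
    """Offline stage (assumed linear model): regress V(s) = w . phi(s) onto MC returns."""
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        error = features @ w - returns
        # Gradient step on the mean squared error between V(s) and the MC return.
        w -= lr * features.T @ error / len(returns)
    return w


def td0_update(w, phi, reward, phi_next, gamma=0.99, lr=0.05, terminal=False):
    """Online stage: one semi-gradient TD(0) step on a single observed transition."""
    bootstrap = 0.0 if terminal else gamma * (phi_next @ w)
    td_error = reward + bootstrap - phi @ w
    return w + lr * td_error * phi
```

In this sketch, `fit_offline` would run once on logged trajectories, and `td0_update` would then be called on each new transition during deployment, so that the value estimate keeps adapting to the environment the robot is currently exploring.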
