Paper Title


HTRON: Efficient Outdoor Navigation with Sparse Rewards via Heavy Tailed Adaptive Reinforce Algorithm

Authors

Kasun Weerakoon, Souradip Chakraborty, Nare Karapetyan, Adarsh Jagan Sathyamoorthy, Amrit Singh Bedi, Dinesh Manocha

Abstract


We present a novel approach to improve the performance of deep reinforcement learning (DRL) based outdoor robot navigation systems. Most existing DRL methods are based on carefully designed dense reward functions that learn efficient behaviors in an environment. We circumvent this issue by working only with sparse rewards (which are easy to design), and propose a novel adaptive Heavy-Tailed Reinforce algorithm for Outdoor Navigation called HTRON. Our main idea is to utilize heavy-tailed policy parametrizations which implicitly induce exploration in sparse reward settings. We evaluate the performance of HTRON against the Reinforce, PPO, and TRPO algorithms in three different outdoor scenarios: goal-reaching, obstacle avoidance, and uneven terrain navigation. We observe, on average, a 34.41% increase in success rate, a 15.15% decrease in the average number of time steps taken to reach the goal, and a 24.9% decrease in elevation cost compared to the navigation policies obtained by the other methods. Further, we demonstrate that our algorithm can be transferred directly to a Clearpath Husky robot to perform outdoor terrain navigation in real-world scenarios.
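The central intuition, that a heavy-tailed policy explores more than a light-tailed (e.g. Gaussian) one of the same scale, can be illustrated with a minimal sketch. This is not the authors' implementation; it only compares how often each distribution samples actions far from the policy mean, where the Cauchy distribution stands in as a representative heavy-tailed parametrization:

```python
import math
import random

random.seed(0)

def gaussian_action(mean, scale):
    """Light-tailed policy: large deviations from the mean are vanishingly rare."""
    return random.gauss(mean, scale)

def cauchy_action(mean, scale):
    """Heavy-tailed policy: sampled via the inverse CDF of the Cauchy distribution.
    Occasional large deviations from the mean drive exploration."""
    u = random.random()
    return mean + scale * math.tan(math.pi * (u - 0.5))

# Fraction of sampled actions landing far (> 5 scale units) from the mean.
n = 100_000
far_gauss = sum(abs(gaussian_action(0.0, 1.0)) > 5 for _ in range(n)) / n
far_cauchy = sum(abs(cauchy_action(0.0, 1.0)) > 5 for _ in range(n)) / n

print(f"Gaussian tail mass beyond 5 scale units: {far_gauss:.5f}")
print(f"Cauchy   tail mass beyond 5 scale units: {far_cauchy:.5f}")
```

Under a sparse reward, those rare large actions are what let the agent stumble onto rewarding states at all; analytically, the standard Cauchy places about 12.6% of its mass beyond five scale units, while the standard Gaussian places essentially none.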
