Deep Reinforcement Learning based Robot Navigation in Dynamic Environments using Occupancy Values of Motion Primitives

Paper Title

Deep Reinforcement Learning based Robot Navigation in Dynamic Environments using Occupancy Values of Motion Primitives

Authors

Neşet Ünver Akmandor, Hongyu Li, Gary Lvov, Eric Dusel, Taşkın Padır

Abstract

This paper presents a Deep Reinforcement Learning based navigation approach in which we define the occupancy observations as heuristic evaluations of motion primitives, rather than using raw sensor data. Our method enables fast mapping of the occupancy data, generated by multi-sensor fusion, into trajectory values in the 3D workspace. The computationally efficient trajectory evaluation allows dense sampling of the action space. We use our occupancy observations in different data structures to analyze their effects on both the training process and navigation performance. We train and test our methodology on two different robots within challenging physics-based simulation environments that include static and dynamic obstacles. We benchmark our occupancy representations against other conventional data structures from state-of-the-art methods. The trained navigation policies are also validated successfully with physical robots in dynamic environments. The results show that, compared to other occupancy representations, our method not only decreases the required training time but also improves navigation performance. The open-source implementation of our work and all related information are available at https://github.com/RIVeR-Lab/tentabot.
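The idea of scoring motion primitives by occupancy, rather than feeding raw sensor data to the policy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the tentabot implementation: the function names (`arc_primitive`, `occupancy_value`, `occupancy_observation`), the constant-velocity arc model with fixed height, the `radius` threshold, and the use of a simple point count as the heuristic value.

```python
import math

def arc_primitive(v, omega, horizon=2.0, steps=10):
    """Sample points along a constant-velocity arc (hypothetical primitive model).

    Assumption: planar unicycle motion with z fixed at 0, used only to keep
    the sketch short; the abstract describes trajectories in a 3D workspace.
    """
    x = y = theta = 0.0
    dt = horizon / steps
    points = []
    for _ in range(steps):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
        points.append((x, y, 0.0))
    return points

def occupancy_value(primitive, occupied_points, radius=0.3):
    """Count occupied points within `radius` of any sample on the primitive.

    Assumption: a plain count stands in for the heuristic evaluation.
    """
    count = 0
    for p in occupied_points:
        if any(math.dist(p, q) <= radius for q in primitive):
            count += 1
    return count

def occupancy_observation(primitives, occupied_points, radius=0.3):
    """Build one observation entry per motion primitive."""
    return [occupancy_value(prim, occupied_points, radius) for prim in primitives]

# Five primitives sampled over the turn rate, and two occupied points
# directly ahead of the robot (illustrative values, not real sensor data).
primitives = [arc_primitive(1.0, w) for w in (-1.0, -0.5, 0.0, 0.5, 1.0)]
obstacles = [(1.0, 0.0, 0.0), (1.1, 0.05, 0.0)]
observation = occupancy_observation(primitives, obstacles)
```

In this sketch the observation has one entry per primitive, so its size depends on how densely the action space is sampled rather than on the sensor resolution. The straight-ahead primitive receives the highest value here because both occupied points lie on its path.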
