蒙特卡洛机器人路径计划

论文标题

蒙特卡洛机器人路径计划

Monte-Carlo Robot Path Planning

论文作者

Dam, T., Chalvatzaki, G., Peters, J., Pajarinen, J.

论文摘要

路径计划是设计机器人行为的关键算法方法。基于抽样的方法，例如快速探索随机树（RRT）或概率路线图，是针对路径计划问题的突出算法解决方案。尽管RRT的指数收敛速率，但只能找到次优路径。另一方面，$ \ textrm {rrt}^*$，一种广泛使用的RRT扩展名，保证了寻找最佳路径的概率完整性，但在复杂环境中的缓慢收敛中遭受了实践的影响。此外，现实世界中的机器人环境通常是可观察到的，或者描述的动力学不足，施放了$ \ textrm {rrt}^*$在复杂任务中的应用。本文研究了一种用于机器人路径计划的流行蒙特卡洛树搜索（MCTS）算法的新型算法公式。值得注意的是，我们通过分析和证明其指数的收敛速率（MCPP）在完全可观察到的马尔可夫决策过程（MDP）的一部分中，并证明其指数的收敛速率（MDPS），另一部分证明了其概率的完整性，可在可见的MDP（POMDPS）中发现可行的途径（POMDPS）限制性距离（限制性观察）（假定限制性观察）。我们的算法贡献使我们能够采用最近提出的MCT的变体，并具有不同的勘探策略来进行机器人路径计划。我们在模拟的2D和3D环境中具有7度自由度（DOF）操纵器以及现实世界机器人路径计划任务中的实验评估，证明了MCPP在POMDP任务中的优势。

Path planning is a crucial algorithmic approach for designing robot behaviors. Sampling-based approaches, like rapidly exploring random trees (RRTs) or probabilistic roadmaps, are prominent algorithmic solutions for path planning problems. Despite its exponential convergence rate, RRT can only find suboptimal paths. On the other hand, $\textrm{RRT}^*$, a widely-used extension to RRT, guarantees probabilistic completeness for finding optimal paths but suffers in practice from slow convergence in complex environments. Furthermore, real-world robotic environments are often partially observable or with poorly described dynamics, casting the application of $\textrm{RRT}^*$ in complex tasks suboptimal. This paper studies a novel algorithmic formulation of the popular Monte-Carlo tree search (MCTS) algorithm for robot path planning. Notably, we study Monte-Carlo Path Planning (MCPP) by analyzing and proving, on the one part, its exponential convergence rate to the optimal path in fully observable Markov decision processes (MDPs), and on the other part, its probabilistic completeness for finding feasible paths in partially observable MDPs (POMDPs) assuming limited distance observability (proof sketch). Our algorithmic contribution allows us to employ recently proposed variants of MCTS with different exploration strategies for robot path planning. Our experimental evaluations in simulated 2D and 3D environments with a 7 degrees of freedom (DOF) manipulator, as well as in a real-world robot path planning task, demonstrate the superiority of MCPP in POMDP tasks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题