Learning Bipedal Walking On Planned Footsteps For Humanoid Robots

Paper Title

Learning Bipedal Walking On Planned Footsteps For Humanoid Robots

Authors

Rohan Pratap Singh, Mehdi Benallegue, Mitsuharu Morisawa, Rafael Cisneros, Fumio Kanehiro

Abstract

Deep reinforcement learning (RL) based controllers for legged robots have demonstrated impressive robustness for walking in different environments for several robot platforms. To enable the application of RL policies for humanoid robots in real-world settings, it is crucial to build a system that can achieve robust walking in any direction, on 2D and 3D terrains, and be controllable by a user-command. In this paper, we tackle this problem by learning a policy to follow a given step sequence. The policy is trained with the help of a set of procedurally generated step sequences (also called footstep plans). We show that simply feeding the upcoming 2 steps to the policy is sufficient to achieve omnidirectional walking, turning in place, standing, and climbing stairs. Our method employs curriculum learning on the complexity of terrains, and circumvents the need for reference motions or pre-trained weights. We demonstrate the application of our proposed method to learn RL policies for 2 new robot platforms - HRP5P and JVRC-1 - in the MuJoCo simulation environment. The code for training and evaluation is available online.
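
The abstract names two ingredients: procedurally generated footstep plans, and a policy input that contains only the upcoming 2 steps. The Python sketch below illustrates one plausible way to set this up. It is not the authors' released code. The `Footstep` fields, the function names `generate_plan` and `upcoming_steps`, the parameter defaults, the world-frame coordinates, and the padding rule at the end of a plan are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Footstep:
    """Hypothetical footstep target: position in metres, heading in radians."""
    x: float
    y: float
    z: float
    yaw: float


def generate_plan(n_steps, step_length=0.3, foot_sep=0.2,
                  step_height=0.0, turn_per_step=0.0):
    """Build an alternating left/right footstep sequence (illustrative only).

    step_length   forward advance of the centre line per step
    foot_sep      lateral distance between left and right footprints
    step_height   height gained per step (non-zero gives a staircase)
    turn_per_step heading change per step (non-zero with step_length=0
                  gives turning in place)
    """
    plan = []
    cx = cy = cz = 0.0  # point on the centre line between the feet
    yaw = 0.0
    for i in range(n_steps):
        yaw += turn_per_step
        cx += step_length * math.cos(yaw)
        cy += step_length * math.sin(yaw)
        cz += step_height
        side = 1.0 if i % 2 == 0 else -1.0  # +1 left foot, -1 right foot
        # Offset the footprint perpendicular to the current heading.
        fx = cx - side * (foot_sep / 2.0) * math.sin(yaw)
        fy = cy + side * (foot_sep / 2.0) * math.cos(yaw)
        plan.append(Footstep(fx, fy, cz, yaw))
    return plan


def upcoming_steps(plan, current_index, horizon=2):
    """Flatten the next `horizon` targets into a list of floats.

    Assumption: when the plan runs out, the final step is repeated so the
    observation keeps a fixed length.
    """
    obs = []
    for k in range(horizon):
        step = plan[min(current_index + k, len(plan) - 1)]
        obs.extend([step.x, step.y, step.z, step.yaw])
    return obs


if __name__ == "__main__":
    stairs = generate_plan(6, step_length=0.25, step_height=0.1)
    print(upcoming_steps(stairs, current_index=0))
```

Under these assumptions, varying `step_length`, `turn_per_step`, and `step_height` between episodes yields forward walking, turning in place, and stair-like plans from the same generator. A terrain curriculum could then, for example, raise `step_height` gradually as training progresses.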

Paper Title