Paper Title
Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis
Paper Authors
Ye Yuan, Kris Kitani
Paper Abstract
Reinforcement learning has shown great promise for synthesizing realistic human behaviors by learning humanoid control policies from motion capture data. However, it is still very challenging to reproduce sophisticated human skills like ballet dance, or to stably imitate long-term human behaviors with complex transitions. The main difficulty lies in the dynamics mismatch between the humanoid model and real humans. That is, motions of real humans may not be physically possible for the humanoid model. To overcome the dynamics mismatch, we propose a novel approach, residual force control (RFC), that augments a humanoid control policy by adding external residual forces into the action space. During training, the RFC-based policy learns to apply residual forces to the humanoid to compensate for the dynamics mismatch and better imitate the reference motion. Experiments on a wide range of dynamic motions demonstrate that our approach outperforms state-of-the-art methods in terms of convergence speed and the quality of learned motions. Notably, we showcase a physics-based virtual character empowered by RFC that can perform highly agile ballet dance moves such as pirouette, arabesque and jeté. Furthermore, we propose a dual-policy control framework, where a kinematic policy and an RFC-based policy work in tandem to synthesize multi-modal infinite-horizon human motions without any task guidance or user input. Our approach is the first humanoid control method that successfully learns from a large-scale human motion dataset (Human3.6M) and generates diverse long-term motions. Code and videos are available at https://www.ye-yuan.com/rfc.
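To make the core idea concrete, below is a minimal, hypothetical sketch of the action-space augmentation the abstract describes: the policy outputs joint torques plus a residual wrench (force and torque) applied externally to the humanoid, and both are forwarded to the physics simulator. This is not the authors' implementation; the dimensions, names, and placeholder physics step are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the RFC action-space augmentation described in the
# abstract. Dimensions, names, and the placeholder physics step are
# illustrative assumptions, not the paper's actual implementation.

NUM_JOINTS = 32    # assumed number of actuated humanoid joints
RESIDUAL_DIM = 6   # assumed residual wrench: 3D force + 3D torque

def split_action(action: np.ndarray):
    """Split an augmented action into joint torques and a residual wrench."""
    torques = action[:NUM_JOINTS]
    residual_wrench = action[NUM_JOINTS:NUM_JOINTS + RESIDUAL_DIM]
    return torques, residual_wrench

def rfc_step(state: np.ndarray, action: np.ndarray) -> np.ndarray:
    """Advance the simulation one step under an RFC-augmented action.

    In a real implementation, the torques would actuate the humanoid's
    joints while the residual wrench is applied externally (e.g., to the
    root body) by the physics engine, compensating for the dynamics
    mismatch so the policy can track otherwise-infeasible reference
    motions. The integration here is a stand-in.
    """
    torques, residual_wrench = split_action(action)
    next_state = state  # placeholder for one physics-engine timestep
    return next_state

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action = rng.standard_normal(NUM_JOINTS + RESIDUAL_DIM)
    torques, wrench = split_action(action)
    print(torques.shape, wrench.shape)  # (32,) (6,)
```

In the dual-policy framework from the abstract, a kinematic policy would generate the reference motion that such an RFC-based control policy then tracks, which is what enables multi-modal, infinite-horizon synthesis without task guidance or user input.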