CPG-RL: Learning Central Pattern Generators for Quadruped Locomotion

Paper Title

CPG-RL: Learning Central Pattern Generators for Quadruped Locomotion

Authors

Guillaume Bellegarda, Auke Ijspeert

Abstract

In this letter, we present a method for integrating central pattern generators (CPGs), i.e. systems of coupled oscillators, into the deep reinforcement learning (DRL) framework to produce robust and omnidirectional quadruped locomotion. The agent learns to directly modulate the intrinsic oscillator setpoints (amplitude and frequency) and coordinate rhythmic behavior among different oscillators. This approach also allows the use of DRL to explore questions related to neuroscience, namely the role of descending pathways, interoscillator couplings, and sensory feedback in gait generation. We train our policies in simulation and perform a sim-to-real transfer to the Unitree A1 quadruped, where we observe robust behavior to disturbances unseen during training, most notably to a dynamically added 13.75 kg load representing 115% of the nominal quadruped mass. We test several different observation spaces based on proprioceptive sensing and show that our framework is deployable with no domain randomization and very little feedback, where along with the oscillator states, it is possible to provide only contact booleans in the observation space. Video results can be found at https://youtu.be/xqXHLzLsEV4.
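To make the idea of "modulating oscillator setpoints" concrete, here is a minimal sketch of a single amplitude-controlled phase oscillator. The abstract does not give the oscillator equations, so the dynamics below are an assumption: a second-order amplitude equation that converges to a setpoint `mu`, and a phase that advances at a frequency setpoint `omega`. The gain `a`, the time step `dt`, and all function names are hypothetical, and inter-oscillator coupling and the learned policy are left out entirely.

```python
import math

def step_oscillator(r, r_dot, theta, mu, omega, a=50.0, dt=0.001):
    """Advance one oscillator by one Euler step.

    Assumed dynamics (not taken from the abstract):
        r_ddot    = a * (a / 4 * (mu - r) - r_dot)   # amplitude converges to mu
        theta_dot = omega                            # phase advances at omega
    """
    r_ddot = a * (a / 4.0 * (mu - r) - r_dot)
    r_dot = r_dot + r_ddot * dt
    r = r + r_dot * dt
    theta = (theta + omega * dt) % (2.0 * math.pi)
    return r, r_dot, theta

def simulate(mu, omega, steps=2000, dt=0.001):
    """Hold the setpoints (mu, omega) fixed and integrate from rest."""
    r, r_dot, theta = 0.0, 0.0, 0.0
    for _ in range(steps):
        r, r_dot, theta = step_oscillator(r, r_dot, theta, mu, omega, dt=dt)
    return r, theta

if __name__ == "__main__":
    # In the setting described above, a policy would supply mu and omega
    # at every control step; here they are simply held constant.
    amplitude, phase = simulate(mu=1.5, omega=1.0)
    print(f"amplitude={amplitude:.4f}, phase={phase:.4f} rad")
```

In a full system, one such oscillator per leg would be mapped to foot or joint targets, with the policy output replacing the constant `mu` and `omega` used here.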
