Residual Robot Learning for Object-Centric Probabilistic Movement Primitives

Paper Title

Residual Robot Learning for Object-Centric Probabilistic Movement Primitives

Authors

Carvalho, Joao; Koert, Dorothea; Daniv, Marek; Peters, Jan

Abstract

It is desirable for future robots to quickly learn new tasks and adapt learned skills to constantly changing environments. To this end, Probabilistic Movement Primitives (ProMPs) have been shown to be a promising framework for learning generalizable trajectory generators from distributions over demonstrated trajectories. However, in practical applications that require high precision in the manipulation of objects, the accuracy of ProMPs is often insufficient, in particular when they are learned in Cartesian space from external observations and executed with limited controller gains. Therefore, we propose to combine ProMPs with the recently introduced Residual Reinforcement Learning (RRL) to account for corrections in both position and orientation during task execution. In particular, we learn a residual on top of a nominal ProMP trajectory with Soft Actor-Critic and incorporate the variability in the demonstrations as a decision variable to reduce the search space for RRL. As a proof of concept, we evaluate our proposed method on a 3D block insertion task with a 7-DoF Franka Emika Panda robot. Experimental results show that the robot successfully learns to complete the insertion, which was not possible using basic ProMPs alone.
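The idea of adding a learned residual to a nominal ProMP trajectory can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional setting, the normalized RBF basis, the function names `rbf_features`, `fit_promp`, and `residual_command`, and in particular the choice to scale the residual by the per-step demonstration standard deviation, which is only one possible way to let demonstration variability limit the search space. The residual action would come from a policy such as Soft Actor-Critic; here it is a plain number.

```python
import numpy as np

def rbf_features(T, n_basis=10, width=0.02):
    """Normalized Gaussian basis functions over a phase in [0, 1] (assumed basis)."""
    z = np.linspace(0.0, 1.0, T)[:, None]
    centers = np.linspace(0.0, 1.0, n_basis)[None, :]
    phi = np.exp(-(z - centers) ** 2 / (2.0 * width))
    return phi / phi.sum(axis=1, keepdims=True)

def fit_promp(demos, n_basis=10, ridge=1e-6):
    """Fit a 1-D ProMP: ridge-regress weights per demo, then take mean and covariance."""
    _, T = demos.shape
    phi = rbf_features(T, n_basis)
    A = phi.T @ phi + ridge * np.eye(n_basis)
    W = np.linalg.solve(A, phi.T @ demos.T).T          # one weight vector per demo
    mean_w = W.mean(axis=0)
    cov_w = np.cov(W, rowvar=False) + ridge * np.eye(n_basis)
    mean_traj = phi @ mean_w                            # nominal trajectory
    std_traj = np.sqrt(np.einsum("tb,bc,tc->t", phi, cov_w, phi))
    return mean_traj, std_traj

def residual_command(nominal, std, action, max_scale=1.0):
    """Nominal set point plus a bounded residual (assumed scaling by demo std)."""
    action = np.clip(action, -1.0, 1.0)                 # policy output assumed in [-1, 1]
    return nominal + max_scale * std * action

# Synthetic demonstrations: the same motion with a small random offset per demo.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
demos = np.stack([np.sin(np.pi * t) + 0.05 * rng.standard_normal() for _ in range(8)])

mean_traj, std_traj = fit_promp(demos)
cmd = residual_command(mean_traj[10], std_traj[10], action=0.5)
```

With this scaling, the commanded set point never moves further from the nominal trajectory than `max_scale` standard deviations, so the policy searches a bounded region around the demonstrated motion instead of the whole workspace.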
