Robot Playing Kendama with Model-Based and Model-Free Reinforcement Learning

Paper Title

Robot Playing Kendama with Model-Based and Model-Free Reinforcement Learning

Paper Author

Li, Shidi

Paper Abstract

Several model-based and model-free methods have been proposed for robot trajectory learning. Both approaches have benefits and drawbacks, and they can often complement each other. Many research works try to integrate model-based and model-free methods into a single algorithm, and the resulting algorithms perform well in simulators or on quasi-static robot tasks. Difficulties remain, however, when such algorithms are applied to particular trajectory learning tasks. In this paper, we propose a robot trajectory learning framework for precise, high-speed tasks with discontinuous dynamics. The trajectories learned from human demonstration are optimized first by DDP and then by PoWER. The framework is tested on the Kendama manipulation task, which is difficult even for humans. The results show that our approach can plan trajectories that successfully complete the task.
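The abstract names PoWER (Policy learning by Weighting Exploration with the Returns) as the model-free refinement stage but does not describe it. The following is a minimal sketch of a commonly used simplified, episodic form of the PoWER update, not the paper's implementation. The toy return function `rollout_return`, the parameter vector `theta`, and all hyperparameters (`n_iters`, `sigma`, `n_best`) are assumptions chosen for illustration; on a real robot, the return would come from executing the trajectory.

```python
import numpy as np


def rollout_return(theta, target):
    # Hypothetical stand-in for a robot rollout: a return in (0, 1]
    # that is highest when the parameters match the target.
    return float(np.exp(-np.sum((theta - target) ** 2)))


def power_update(theta, history, n_best=10):
    # Reuse only the best rollouts seen so far (importance sampling).
    best = sorted(history, key=lambda item: item[0], reverse=True)[:n_best]
    returns = np.array([r for r, _ in best])
    params = np.array([p for _, p in best])
    # Return-weighted average of the explored parameter offsets.
    weighted_offsets = (returns[:, None] * (params - theta)).sum(axis=0)
    return theta + weighted_offsets / (returns.sum() + 1e-12)


def optimize(theta0, target, n_iters=500, sigma=0.2, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    history = []  # list of (return, explored parameters)
    for _ in range(n_iters):
        # Explore by perturbing the current parameters with Gaussian noise.
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        history.append((rollout_return(candidate, target), candidate))
        theta = power_update(theta, history)
    return theta
```

In this sketch, `theta0` plays the role of a trajectory parameterization that an earlier stage (such as the demonstration and DDP steps mentioned in the abstract) would supply; the update needs only non-negative returns and no dynamics model.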
