Paper Title

Sample-efficient Cross-Entropy Method for Real-time Planning

Authors

Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Jan Achterhold, Joerg Stueckler, Michal Rolinek, Georg Martius

Abstract

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
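To make the two ideas named in the abstract concrete, here is a minimal sketch of a CEM planner with temporally-correlated action noise and a simple form of memory. It is an illustration under stated assumptions, not the authors' implementation: the AR(1) noise process, the reading of "memory" as carrying a fraction of elite sequences into the next iteration, the toy cost, and every name and default value (`correlated_noise`, `cem_plan`, `rho`, `keep_frac`, `toy_cost`) are hypothetical choices made for this example.

```python
import numpy as np


def correlated_noise(rng, n_samples, horizon, dim, rho=0.8):
    """AR(1) noise along the time axis with unit marginal variance.

    Assumption: AR(1) is only a simple stand-in for "temporally-correlated
    actions"; the abstract does not specify the noise process.
    """
    eps = rng.standard_normal((n_samples, horizon, dim))
    noise = np.empty_like(eps)
    noise[:, 0] = eps[:, 0]
    for t in range(1, horizon):
        noise[:, t] = rho * noise[:, t - 1] + np.sqrt(1.0 - rho**2) * eps[:, t]
    return noise


def cem_plan(cost_fn, horizon, dim, n_samples=64, n_elites=8,
             n_iters=5, keep_frac=0.5, rho=0.8, seed=0):
    """Return the lowest-cost action sequence found, shape (horizon, dim)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, dim))
    std = np.ones((horizon, dim))
    kept = np.empty((0, horizon, dim))  # memory: elites from the last iteration

    for _ in range(n_iters):
        # Sample action sequences around the current mean with correlated noise.
        samples = mean + std * correlated_noise(rng, n_samples, horizon, dim, rho)
        samples = np.clip(samples, -1.0, 1.0)  # assumed action bounds

        # Memory (assumed form): re-evaluate some previous elites with new samples.
        candidates = np.concatenate([samples, kept], axis=0)
        costs = np.array([cost_fn(a) for a in candidates])

        # Refit the sampling distribution to the lowest-cost sequences.
        elites = candidates[np.argsort(costs)[:n_elites]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
        kept = elites[: int(keep_frac * n_elites)]

    return elites[0]


def toy_cost(actions):
    """Hypothetical cost: the running sum of actions should stay near 1.0."""
    position = np.cumsum(actions[:, 0])
    return float(np.sum((position - 1.0) ** 2))


plan = cem_plan(toy_cost, horizon=10, dim=1)
```

In this sketch `plan` holds the best sequence from the final iteration. Because the best elites are carried forward, the best cost cannot get worse from one iteration to the next, which is one plausible way memory could reduce the number of samples needed.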
