Paper title
Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations
Authors
Abstract
The increasing importance of robots and automation creates a demand for learnable controllers, which can be obtained through various approaches such as Evolutionary Algorithms (EAs) or Reinforcement Learning (RL). Unfortunately, these two families of algorithms have mainly developed independently, and there are only a few works comparing modern EAs with deep RL algorithms. We show that Multidimensional Archive of Phenotypic Elites (MAP-Elites), a modern EA, can deliver better-performing solutions than one of the state-of-the-art RL methods, Proximal Policy Optimization (PPO), in the generation of locomotion controllers for a simulated hexapod robot. Additionally, extensive hyper-parameter tuning shows that MAP-Elites displays greater robustness across seeds and hyper-parameter sets. Generally, this paper demonstrates that EAs combined with modern computational resources display promising characteristics and have the potential to contribute to the state of the art in controller learning.