Automatic Parameter Optimization Using Genetic Algorithm in Deep Reinforcement Learning for Robotic Manipulation Tasks

Paper Title

Automatic Parameter Optimization Using Genetic Algorithm in Deep Reinforcement Learning for Robotic Manipulation Tasks

Authors

Sehgal, Adarsh; Ward, Nicholas; La, Hung; Louis, Sushil

Abstract

Learning agents can use Reinforcement Learning (RL) to decide their actions by means of a reward function. However, the learning process is strongly influenced by the choice of values for the hyperparameters used in the learning algorithm. This work proposes a method based on Deep Deterministic Policy Gradient (DDPG) and Hindsight Experience Replay (HER) that uses a Genetic Algorithm (GA) to fine-tune the hyperparameter values. The method (GA+DDPG+HER) was evaluated on six robotic manipulation tasks: FetchReach, FetchSlide, FetchPush, FetchPickAndPlace, DoorOpening, and AuboReach. Analysis of the results demonstrated a significant increase in performance and a decrease in learning time. We also compare GA+DDPG+HER against existing methods and provide evidence that it performs better.
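
To make the idea of GA-based hyperparameter tuning concrete, here is a minimal sketch of a genetic algorithm searching over a few hyperparameters. It is not the authors' implementation: the hyperparameter names and ranges in `SEARCH_SPACE`, the real-valued encoding, the operators, and the `toy_fitness` function are all assumptions made for illustration. In an actual setup the fitness function would train DDPG+HER with the candidate values and score the run.

```python
import random

# Hypothetical search space: name -> (low, high). Names and ranges are
# illustrative assumptions, not values taken from the paper.
SEARCH_SPACE = {
    "polyak": (0.0, 1.0),
    "gamma": (0.0, 1.0),
    "actor_lr": (0.0001, 0.01),
    "critic_lr": (0.0001, 0.01),
}


def toy_fitness(params):
    """Stand-in for 'train DDPG+HER and score the run'; higher is better.

    The target values below are arbitrary and exist only so the sketch
    runs quickly without any RL training.
    """
    target = {"polyak": 0.9, "gamma": 0.95, "actor_lr": 0.001, "critic_lr": 0.001}
    return -sum(((params[k] - target[k]) / (hi - lo)) ** 2
                for k, (lo, hi) in SEARCH_SPACE.items())


def random_individual(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}


def mutate(ind, rng, rate=0.2):
    # With probability `rate`, resample a gene uniformly within its range.
    return {k: (rng.uniform(*SEARCH_SPACE[k]) if rng.random() < rate else v)
            for k, v in ind.items()}


def tournament(pop, scores, rng, k=3):
    # Pick k distinct individuals and return the fittest of them.
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def run_ga(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best = pop[max(range(pop_size), key=lambda i: scores[i])]
        history.append(max(scores))
        children = [best]  # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, scores, rng)
            p2 = tournament(pop, scores, rng)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = children
    scores = [fitness(ind) for ind in pop]
    best = pop[max(range(pop_size), key=lambda i: scores[i])]
    history.append(max(scores))
    return best, history


if __name__ == "__main__":
    best_params, best_per_generation = run_ga(toy_fitness)
    print(best_params)
    print(best_per_generation[0], best_per_generation[-1])
```

Because the best individual is carried over unchanged each generation, the best fitness never decreases from one generation to the next. Replacing `toy_fitness` with a function that runs training is the expensive part in practice, since every fitness evaluation is a full or partial training run.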
