Paper Title
Cluster-based Sampling in Hindsight Experience Replay for Robotic Tasks (Student Abstract)
Authors
Abstract
In multi-goal reinforcement learning with a sparse binary reward, training agents is particularly challenging because successful experiences are rare. To address this problem, hindsight experience replay (HER) generates successful experiences even from unsuccessful ones. However, generating successful experiences from uniformly sampled ones is not an efficient process. This paper investigates the impact of exploiting the properties of achieved goals when generating successful experiences and proposes a novel cluster-based sampling strategy. The proposed strategy uses a cluster model to group episodes according to their achieved goals, then samples experiences from these groups in the manner of HER to create the training batch. The method is validated by experiments on three robotic control tasks from OpenAI Gym. The experimental results demonstrate that the proposed method is substantially more sample-efficient and achieves better performance than baseline approaches.
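The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the cluster model is a plain k-means over each episode's final achieved goal, episodes are drawn round-robin across clusters, relabeling uses HER's "future" strategy, and the reward is 0 within a distance tolerance and -1 otherwise. The names `kmeans`, `cluster_her_batch`, and `tol` are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the unspecified cluster model)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels


def cluster_her_batch(achieved, k, batch_size, tol=0.05, seed=0):
    """Build a HER-style batch with episodes drawn evenly across clusters.

    achieved: array of shape (n_episodes, T + 1, goal_dim) holding the
    achieved goal at every step of every stored episode.
    Returns (batch, labels); each batch item is (episode, t, goal, reward).
    """
    achieved = np.asarray(achieved, dtype=float)
    rng = np.random.default_rng(seed)
    horizon = achieved.shape[1] - 1

    # Assumption: an episode is summarised by its final achieved goal.
    labels = kmeans(achieved[:, -1, :], k, seed=seed)
    clusters = [np.flatnonzero(labels == c) for c in range(k)]
    clusters = [members for members in clusters if len(members) > 0]

    batch = []
    for i in range(batch_size):
        members = clusters[i % len(clusters)]   # round-robin over clusters
        ep = int(rng.choice(members))           # episode within that cluster
        t = int(rng.integers(0, horizon))       # transition index
        future = int(rng.integers(t + 1, horizon + 1))
        goal = achieved[ep, future]             # HER "future" relabeling
        # Assumed sparse binary reward: 0 on success, -1 otherwise.
        dist = np.linalg.norm(achieved[ep, t + 1] - goal)
        reward = 0.0 if dist <= tol else -1.0
        batch.append((ep, t, goal, reward))
    return batch, labels
```

Calling `cluster_her_batch(achieved, k=3, batch_size=32)` on a replay buffer's achieved-goal array returns 32 relabeled transitions spread across the non-empty clusters. Uniform HER sampling, by contrast, would draw episodes without regard to where their achieved goals lie.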