Paper title
Learning from Simulation, Racing in Reality
Paper authors
Paper abstract
We present a reinforcement learning-based solution for autonomous racing on a miniature race car platform. We show that a policy trained purely in simulation, using a relatively simple vehicle model with model randomization, can be successfully transferred to the real robotic setup. We achieve this by using a novel policy output regularization approach and a lifted action space, which together enable smooth yet aggressive race car driving. The regularized policy outperforms the Soft Actor-Critic (SAC) baseline both in simulation and on the real car, but it is still outperformed by a state-of-the-art Model Predictive Controller (MPC). Refining the policy with three hours of real-world interaction data allows the reinforcement learning policy to achieve lap times similar to those of the MPC while reducing track constraint violations by 50%.
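The abstract names three ingredients (model randomization, a lifted action space, and policy output regularization) without giving their details. The sketch below shows one plausible reading and is not the authors' implementation. It assumes that "lifted" means the policy outputs the rate of change of the physical commands, which are then integrated and clipped, and that the regularizer is a quadratic penalty on that rate. All names, parameters, and ranges (`VehicleParams`, `randomize_params`, `lifted_step`, `smoothness_penalty`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class VehicleParams:
    # Hypothetical parameters of a simple vehicle model (assumption).
    mass: float
    tire_friction: float


def randomize_params(nominal, rel_range, rng):
    """Model randomization: scale each nominal parameter by an
    independent factor drawn uniformly from [1 - rel_range, 1 + rel_range]."""
    def scale():
        return 1.0 + rng.uniform(-rel_range, rel_range)

    return VehicleParams(
        mass=nominal.mass * scale(),
        tire_friction=nominal.tire_friction * scale(),
    )


def lifted_step(prev_action, action_rate, dt, lo=-1.0, hi=1.0):
    """Lifted action space (assumed form): the policy outputs a rate of
    change, and the physical command is the clipped integral of that rate."""
    return [max(lo, min(hi, a + r * dt)) for a, r in zip(prev_action, action_rate)]


def smoothness_penalty(action_rate, weight):
    """Assumed regularizer: quadratic cost on the commanded rate of change,
    to be subtracted from the reward during training."""
    return weight * sum(r * r for r in action_rate)
```

Under these assumptions, a training loop would draw fresh parameters with `randomize_params` at the start of each simulated episode, pass the policy output through `lifted_step` before applying it to the vehicle model, and subtract `smoothness_penalty` from the reward.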