Paper title
Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning
Paper authors
Paper abstract
Autonomous car racing is a major challenge in robotics. It raises fundamental problems for classical approaches such as planning minimum-time trajectories under uncertain dynamics and controlling the car at the limits of its handling. Besides, the requirement of minimizing the lap time, which is a sparse objective, and the difficulty of collecting training data from human experts have also hindered researchers from directly applying learning-based approaches to solve the problem. In the present work, we propose a learning-based system for autonomous car racing by leveraging a high-fidelity physical car simulation, a course-progress proxy reward, and deep reinforcement learning. We deploy our system in Gran Turismo Sport, a world-leading car simulator known for its realistic physics simulation of different race cars and tracks, which is even used to recruit human race car drivers. Our trained policy achieves autonomous racing performance that goes beyond what had been achieved so far by the built-in AI, and, at the same time, outperforms the fastest driver in a dataset of over 50,000 human players.
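To illustrate why a course-progress proxy reward helps with the sparse lap-time objective, the following is a minimal sketch and not the authors' exact formulation, which the abstract does not specify. It assumes the car's position has already been projected onto the track centerline and expressed as an arc length in metres; the function name `progress_rewards` and its parameters are hypothetical.

```python
def progress_rewards(centerline_positions, track_length):
    """Dense proxy reward: centerline progress between consecutive time steps.

    centerline_positions: arc-length position (in metres) of the car's
        projection onto the centerline at each time step, in [0, track_length).
    track_length: total centerline length of the closed track, in metres.
    Both the representation and the wrap-around handling are assumptions
    made for this illustration.
    """
    rewards = []
    for prev, curr in zip(centerline_positions, centerline_positions[1:]):
        delta = curr - prev
        # Crossing the start/finish line makes the raw difference jump by
        # roughly one track length; undo that jump.
        if delta < -track_length / 2:
            delta += track_length
        elif delta > track_length / 2:
            delta -= track_length
        rewards.append(delta)
    return rewards


# A car that accelerates and then crosses the start/finish line of a 100 m loop.
positions = [0.0, 10.0, 25.0, 45.0, 70.0, 95.0, 5.0]
print(progress_rewards(positions, track_length=100.0))
```

Under these assumptions the agent receives a signal at every step rather than only when a lap is completed. Covering more track per step yields a larger per-step reward, so maximizing the accumulated progress over a fixed time horizon favors shorter lap times.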