Benchmarking End-to-End Behavioural Cloning on Video Games

Paper title

Benchmarking End-to-End Behavioural Cloning on Video Games

Authors

Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki

Abstract

Behavioural cloning, where a computer is taught to perform a task based on demonstrations, has been successfully applied to various video games and robotics tasks, with and without reinforcement learning. This also includes end-to-end approaches, where a computer plays a video game like humans do: by looking at the image displayed on the screen, and sending keystrokes to the game. As a general approach to playing video games, this has many inviting properties: no need for specialized modifications to the game, no lengthy training sessions and the ability to re-use the same tools across different games. However, related work includes game-specific engineering to achieve the results. We take a step towards a general approach and study the general applicability of behavioural cloning on twelve video games, including six modern video games (published after 2010), by using human demonstrations as training data. Our results show that these agents cannot match humans in raw performance but do learn basic dynamics and rules. We also demonstrate how the quality of the data matters, and how recording data from humans is subject to a state-action mismatch, due to human reflexes.
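The state-action mismatch mentioned in the abstract can be pictured as follows: a human sees a frame, takes a moment to react, and the resulting keypress is logged against a later frame. The sketch below is only an illustration under that assumption; the abstract does not say how the authors handle the mismatch, and the function name `realign` and the fixed `delay` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")
Action = TypeVar("Action")


def realign(
    frames: Sequence[Frame],
    actions: Sequence[Action],
    delay: int,
) -> List[Tuple[Frame, Action]]:
    """Pair each frame with the action recorded `delay` steps later.

    Assumption (not from the paper): the human reaction time is a fixed
    number of frames, so the action logged at step t + delay is treated
    as the response to the frame shown at step t.
    """
    if len(frames) != len(actions):
        raise ValueError("frames and actions must have the same length")
    if delay < 0:
        raise ValueError("delay must be non-negative")
    # The last `delay` frames have no matching action and are dropped.
    return [(frames[t], actions[t + delay]) for t in range(len(frames) - delay)]


if __name__ == "__main__":
    recorded_frames = ["f0", "f1", "f2", "f3"]
    recorded_actions = ["a0", "a1", "a2", "a3"]
    for frame, action in realign(recorded_frames, recorded_actions, delay=1):
        print(frame, action)
```

With `delay=0` the recorded pairs are returned unchanged, so the same training code can be run with and without the shift to compare the effect.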

Paper title