EgoMap: Projective mapping and structured egocentric memory for Deep RL

Paper title

EgoMap: Projective mapping and structured egocentric memory for Deep RL

Authors

Beeching, Edward, Wolf, Christian, Dibangoye, Jilles, Simonin, Olivier

Abstract

Tasks involving localization, memorization and planning in partially observable 3D environments are an ongoing challenge in Deep Reinforcement Learning. We present EgoMap, a spatially structured neural memory architecture. EgoMap augments a deep reinforcement learning agent's performance in 3D environments on challenging tasks with multi-step objectives. The EgoMap architecture incorporates several inductive biases including a differentiable inverse projection of CNN feature vectors onto a top-down spatially structured map. The map is updated with ego-motion measurements through a differentiable affine transform. We show this architecture outperforms both standard recurrent agents and state of the art agents with structured memory. We demonstrate that incorporating these inductive biases into an agent's architecture allows for stable training with reward alone, circumventing the expense of acquiring and labelling expert trajectories. A detailed ablation study demonstrates the impact of key aspects of the architecture and through extensive qualitative analysis, we show how the agent exploits its structured internal memory to achieve higher performance.
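
The abstract only names the inverse projection step, so the following is a minimal NumPy sketch of what such a mapping could look like, not the authors' implementation. It assumes a pinhole camera with a known horizontal field of view, a per-pixel depth map, and max pooling when several pixels land in the same cell. The function name `project_to_map` and the parameters `fov_deg`, `map_size` and `cell` are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_to_map(features, depth, fov_deg=90.0, map_size=16, cell=0.5):
    """Scatter per-pixel feature vectors into an egocentric top-down grid.

    features: (H, W, C) array of feature vectors.
    depth:    (H, W) array of forward distances, one per pixel.
    All defaults are illustrative assumptions, not values from the paper.
    """
    H, W, C = features.shape

    # Pinhole model: focal length in pixels from the horizontal field of view.
    f = (W / 2.0) / np.tan(np.radians(fov_deg) / 2.0)

    # Horizontal pixel offset from the image centre.
    u = np.arange(W) - (W - 1) / 2.0

    # Back-project each pixel to a lateral (x) and forward (z) position.
    x = depth * u[None, :] / f
    z = depth

    # The agent sits at the bottom-centre of the grid, looking "up" the rows.
    col = np.floor(x / cell).astype(int) + map_size // 2
    row = map_size - 1 - np.floor(z / cell).astype(int)
    valid = (col >= 0) & (col < map_size) & (row >= 0) & (row < map_size)

    # Max-pool features that fall into the same cell (an assumed choice).
    flat = np.full((map_size * map_size, C), -np.inf)
    np.maximum.at(flat, row[valid] * map_size + col[valid], features[valid])

    # Cells that received no feature are left empty (zero).
    flat[np.isinf(flat)] = 0.0
    return flat.reshape(map_size, map_size, C)
```

With a 1x3 feature map, unit depth and the defaults above, the centre pixel lands two cells ahead of the agent in the middle column of the grid. In a learned agent the same scatter would be written with differentiable tensor operations so that gradients can flow back into the CNN features.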
