Paper title
Deep Sets for Generalization in RL
Authors
Abstract
This paper investigates encoding object-centered representations in the design of the reward function and policy architectures of a language-guided reinforcement learning agent. This is done using a combination of object-wise permutation-invariant networks inspired by Deep Sets and gated-attention mechanisms. In a 2D procedurally generated world where agents pursuing goals expressed in natural language navigate and interact with objects, we show that these architectures generalize strongly to out-of-distribution goals. We study generalization to varying numbers of objects at test time and further extend the object-centered architectures to goals involving relational reasoning.
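To make the architectural idea concrete, the following is a minimal sketch of a Deep Sets-style, permutation-invariant scorer with a goal-conditioned gate. It is not the paper's implementation: the function `score_goal`, the parameter names (`w_gate`, `w_phi`, `w_rho`), the dimensions, and the random weights are all assumptions made for illustration. It only shows why sum-pooling over per-object encodings makes the output independent of object order and of the number of objects.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, vec):
    # Matrix-vector product; weights is a list of rows.
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    return [max(0.0, v) for v in vec]


def make_weights(rows, cols, rng):
    # Random placeholder weights (hypothetical, untrained).
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def score_goal(objects, goal, params):
    """Score a set of object feature vectors against a goal embedding."""
    # Gated attention (assumed form): the goal embedding produces one gate
    # in (0, 1) per object feature, applied identically to every object.
    gate = [sigmoid(g) for g in linear(params["w_gate"], goal)]
    pooled = [0.0] * len(params["w_phi"])
    for obj in objects:
        gated = [a * x for a, x in zip(gate, obj)]
        # Shared per-object encoder (phi in Deep Sets notation).
        encoded = relu(linear(params["w_phi"], gated))
        # Sum-pooling: order-independent, works for any number of objects.
        pooled = [p + e for p, e in zip(pooled, encoded)]
    # Readout on the pooled representation (rho in Deep Sets notation).
    return sigmoid(linear(params["w_rho"], pooled)[0])


rng = random.Random(0)
OBJ_DIM, GOAL_DIM, HIDDEN = 4, 3, 8  # illustrative sizes only
params = {
    "w_gate": make_weights(OBJ_DIM, GOAL_DIM, rng),
    "w_phi": make_weights(HIDDEN, OBJ_DIM, rng),
    "w_rho": make_weights(1, HIDDEN, rng),
}
goal = [1.0, 0.0, 0.5]
objects = [[rng.uniform(-1.0, 1.0) for _ in range(OBJ_DIM)] for _ in range(3)]
```

Calling `score_goal(objects, goal, params)` and `score_goal(list(reversed(objects)), goal, params)` should give the same value up to floating-point rounding, and the same function accepts a longer or shorter `objects` list without any change to the parameters.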