Paper Title

Possibility Before Utility: Learning And Using Hierarchical Affordances

Authors

Robby Costales, Shariq Iqbal, Fei Sha

Abstract

Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures. Humans and other intelligent agents do not waste time assessing the utility of every high-level action in existence, but instead only consider ones they deem possible in the first place. By focusing only on what is feasible, or "afforded", at the present moment, an agent can spend more time both evaluating the utility of and acting on what matters. To this end, we present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning. Existing works in hierarchical reinforcement learning provide agents with structural representations of subtasks but are not affordance-aware, and by grounding our definition of hierarchical affordances in the present state, our approach is more flexible than the multitude of approaches that ground their subtask dependencies in a symbolic history. While these logic-based methods often require complete knowledge of the subtask hierarchy, our approach is able to utilize incomplete and varying symbolic specifications. Furthermore, we demonstrate that relative to non-affordance-aware methods, HAL agents are better able to efficiently learn complex tasks, navigate environment stochasticity, and acquire diverse skills in the absence of extrinsic supervision -- all of which are hallmarks of human learning.
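The pruning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the affordance model's output is consumed, so the threshold, the epsilon-greedy rule, the fallback when nothing is predicted as afforded, and the function name `select_subtask` are all assumptions made for illustration. The sketch only shows the general pattern of masking subtasks a learned model deems impossible before the high-level policy compares their estimated utilities.

```python
import random
from typing import Optional, Sequence


def select_subtask(
    q_values: Sequence[float],
    affordance_probs: Sequence[float],
    threshold: float = 0.5,
    epsilon: float = 0.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick a subtask index, considering only subtasks predicted to be afforded.

    q_values: estimated utility of each subtask in the current state.
    affordance_probs: hypothetical affordance-model outputs, one per subtask,
        read as the probability that the subtask is possible right now.
    threshold, epsilon: illustrative hyperparameters (assumptions).
    """
    if len(q_values) != len(affordance_probs):
        raise ValueError("q_values and affordance_probs must have equal length")
    rng = rng or random.Random()

    # Keep only the subtasks the affordance model considers possible.
    afforded = [i for i, p in enumerate(affordance_probs) if p >= threshold]

    # Assumed fallback: if everything is pruned, consider all subtasks again.
    if not afforded:
        afforded = list(range(len(q_values)))

    # Epsilon-greedy choice restricted to the afforded set.
    if rng.random() < epsilon:
        return rng.choice(afforded)
    return max(afforded, key=lambda i: q_values[i])
```

For example, with utilities `[1.0, 5.0, 3.0]` and affordance predictions `[0.9, 0.1, 0.8]`, the highest-valued subtask (index 1) is pruned as impossible, so the greedy choice falls to index 2.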
