Prototyping three key properties of specific curiosity in computational reinforcement learning

Paper title

Prototyping three key properties of specific curiosity in computational reinforcement learning

Paper authors

Ady, Nadia M., Shariff, Roshan, Günther, Johannes, Pilarski, Patrick M.

Paper abstract

Curiosity for machine agents has been a focus of intense research. The study of human and animal curiosity, particularly specific curiosity, has unearthed several properties that would offer important benefits for machine learners, but that have not yet been well-explored in machine intelligence. In this work, we introduce three of the most immediate of these properties -- directedness, cessation when satisfied, and voluntary exposure -- and show how they may be implemented together in a proof-of-concept reinforcement learning agent; further, we demonstrate how the properties manifest in the behaviour of this agent in a simple non-episodic grid-world environment that includes curiosity-inducing locations and induced targets of curiosity. As we would hope, the agent exhibits short-term directed behaviour while updating long-term preferences to adaptively seek out curiosity-inducing situations. This work therefore presents a novel view into how specific curiosity operates and in the future might be integrated into the behaviour of goal-seeking, decision-making agents in complex environments.
