Paper Title

Reward-rational (implicit) choice: A unifying formalism for reward learning

Authors

Hong Jun Jeon, Smitha Milli, Anca D. Dragan

Abstract

It is often difficult to hand-specify what the correct reward function is for a task, so researchers have instead aimed to learn reward functions from human behavior or feedback. The types of behavior interpreted as evidence of the reward function have expanded greatly in recent years. We've gone from demonstrations, to comparisons, to reading into the information leaked when the human is pushing the robot away or turning it off. And surely, there is more to come. How will a robot make sense of all these diverse types of behavior? Our key insight is that different types of behavior can be interpreted in a single unifying formalism - as a reward-rational choice that the human is making, often implicitly. The formalism offers both a unifying lens with which to view past work, as well as a recipe for interpreting new sources of information that are yet to be uncovered. We provide two examples to showcase this: interpreting a new feedback type, and reading into how the choice of feedback itself leaks information about the reward.
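The abstract's central idea, interpreting a human's (possibly implicit) choice as Boltzmann-rational in an unknown reward, can be sketched as a small Bayesian update. This is an illustrative toy, not the paper's code: all names (`boltzmann_likelihood`, `posterior_over_rewards`, the feature setup) are made up here, and the likelihood model `P(choice | reward) ∝ exp(beta * reward(choice))` is the standard Boltzmann-rationality assumption the formalism builds on.

```python
import math

BETA = 2.0  # rationality coefficient: higher = more reliably optimal human

def boltzmann_likelihood(choice, choice_set, reward_fn, beta=BETA):
    """P(choice | reward): Boltzmann-rational choice over the option set."""
    scores = {c: math.exp(beta * reward_fn(c)) for c in choice_set}
    return scores[choice] / sum(scores.values())

def posterior_over_rewards(choice, choice_set, hypotheses, prior):
    """Bayesian update over a discrete set of candidate reward functions."""
    unnorm = {
        name: prior[name] * boltzmann_likelihood(choice, choice_set, r)
        for name, r in hypotheses.items()
    }
    z = sum(unnorm.values())
    return {name: p / z for name, p in unnorm.items()}

# Toy example: each option is summarized by one feature (say, distance the
# robot keeps from a person); two reward hypotheses disagree about its sign.
choice_set = [0.2, 1.0]          # feature values of the options shown
hypotheses = {
    "prefers_close": lambda phi: -phi,
    "prefers_far": lambda phi: phi,
}
prior = {"prefers_close": 0.5, "prefers_far": 0.5}

# The human picked the far option, so "prefers_far" should gain mass.
posterior = posterior_over_rewards(1.0, choice_set, hypotheses, prior)
print(posterior)
```

The unifying point of the formalism is that many feedback types (demonstrations, comparisons, pushing the robot away, switching it off) fit this same template once the choice set and a grounding of each option into trajectory space are specified; only those two ingredients change across feedback types.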
