Forming Real-World Human-Robot Cooperation for Tasks With General Goal

Paper title

Forming Real-World Human-Robot Cooperation for Tasks With General Goal

Authors

Tao, Lingfeng; Bowman, Michael; Zhang, Jiucai; Zhang, Xiaoli

Abstract

In human-robot cooperation, the robot cooperates with humans to accomplish the task together. Existing approaches assume the human has a specific goal during the cooperation, and the robot infers and acts toward it. However, in real-world environments, a human usually only has a general goal (e.g., general direction or area in motion planning) at the beginning of the cooperation, which needs to be clarified to a specific goal (i.e., an exact position) during cooperation. The specification process is interactive and dynamic, which depends on the environment and the partner's behavior. The robot that does not consider the goal specification process may cause frustration to the human partner, elongate the time to come to an agreement, and compromise team performance. This work presents the Evolutionary Value Learning approach to model the dynamics of the goal specification process with State-based Multivariate Bayesian Inference and goal specificity-related features. This model enables the robot to enhance the process of the human's goal specification actively and find a cooperative policy in a Deep Reinforcement Learning manner. Our method outperforms existing methods with faster goal specification processes and better team performance in a dynamic ball balancing task with real human subjects.

Paper title