Factorizing Perception and Policy for Interactive Instruction Following

Paper Title

Factorizing Perception and Policy for Interactive Instruction Following

Authors

Kunal Pratap Singh, Suvaansh Bhambri, Byeonghwi Kim, Roozbeh Mottaghi, Jonghyun Choi

Abstract

Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for AI agents. The "interactive instruction following" task attempts to make progress towards building agents that jointly navigate, interact, and reason in the environment at every step. To address this multifaceted problem, we propose a model that factorizes the task into interactive perception and action policy streams with enhanced components, and name it MOCA, a Modular Object-Centric Approach. We empirically validate that MOCA outperforms prior art by significant margins on the ALFRED benchmark, with improved generalization.
