PRIMA: Planner-Reasoner Inside a Multi-task Reasoning Agent

Paper title

PRIMA: Planner-Reasoner Inside a Multi-task Reasoning Agent

Authors

Lyu, Daoming; Liu, Bo; Chen, Jianshu

Abstract

We consider the problem of multi-task reasoning (MTR), where an agent can solve multiple tasks via (first-order) logic reasoning. This capability is essential for human-like intelligence due to its strong generalizability and simplicity for handling multiple tasks. However, a major challenge in developing effective MTR is the intrinsic conflict between reasoning capability and efficiency. An MTR-capable agent must master a large set of "skills" to tackle diverse tasks, but executing a particular task at the inference stage requires only a small subset of immediately relevant skills. How can we maintain broad reasoning capability and also efficient specific-task performance? To address this problem, we propose a Planner-Reasoner framework capable of state-of-the-art MTR capability and high efficiency. The Reasoner models shareable (first-order) logic deduction rules, from which the Planner selects a subset to compose into efficient reasoning paths. The entire model is trained in an end-to-end manner using deep reinforcement learning, and experimental studies over a variety of domains validate its effectiveness.
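The division of labor described in the abstract (a shared pool of deduction rules, from which a planner picks only the subset a task needs) can be illustrated with a small toy sketch. This is not the paper's model: the abstract describes a system trained end to end with deep reinforcement learning, whereas here the rules are hand-written Python functions and the "planner" is a fixed lookup table. Every name below (`RULE_POOL`, `PLAN`, `reason`, `solve`, the family facts) is a hypothetical choice made for illustration only.

```python
from itertools import product

# Facts are ground atoms such as ("parent", "ann", "bob").
# Each rule takes the current fact set and returns the facts it can derive.

def rule_grandparent(facts):
    parents = [f for f in facts if f[0] == "parent"]
    return {("grandparent", a[1], b[2])
            for a, b in product(parents, parents) if a[2] == b[1]}

def rule_sibling(facts):
    parents = [f for f in facts if f[0] == "parent"]
    return {("sibling", a[2], b[2])
            for a, b in product(parents, parents)
            if a[1] == b[1] and a[2] != b[2]}

def rule_ancestor(facts):
    base = {("ancestor", f[1], f[2]) for f in facts if f[0] == "parent"}
    anc = [f for f in facts if f[0] == "ancestor"]
    step = {("ancestor", a[1], b[2])
            for a, b in product(anc, anc) if a[2] == b[1]}
    return base | step

# Shared rule pool: the toy counterpart of the Reasoner's shareable rules.
RULE_POOL = {
    "grandparent": rule_grandparent,
    "sibling": rule_sibling,
    "ancestor": rule_ancestor,
}

# Hypothetical stand-in for a learned Planner: a fixed task -> rule subset map.
PLAN = {
    "grandparent": ["grandparent"],
    "sibling": ["sibling"],
    "ancestor": ["ancestor"],
}

def reason(facts, rule_names):
    """Forward-chain with the given rules until no new fact appears.

    Returns the closed fact set and the number of rule applications.
    """
    facts = set(facts)
    applications = 0
    while True:
        new = set()
        for name in rule_names:
            new |= RULE_POOL[name](facts)
            applications += 1
        if new <= facts:
            return facts, applications
        facts |= new

def solve(task, facts, use_planner=True):
    """Answer one task, using either the planned subset or the whole pool."""
    names = PLAN[task] if use_planner else list(RULE_POOL)
    derived, applications = reason(facts, names)
    return {f for f in derived if f[0] == task}, applications

FACTS = {
    ("parent", "ann", "bob"),
    ("parent", "bob", "cid"),
    ("parent", "bob", "dan"),
}

if __name__ == "__main__":
    planned, planned_cost = solve("grandparent", FACTS)
    full, full_cost = solve("grandparent", FACTS, use_planner=False)
    print(sorted(planned), planned_cost, full_cost)
```

In this toy setting both calls return the same grandparent facts, but the planned run applies fewer rules because it never evaluates the sibling or ancestor rules. That is the capability-versus-efficiency trade-off the abstract refers to, shown here only in miniature.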
