Learning Task Embeddings for Teamwork Adaptation in Multi-Agent Reinforcement Learning

Paper Title

Learning Task Embeddings for Teamwork Adaptation in Multi-Agent Reinforcement Learning

Authors

Lukas Schäfer, Filippos Christianos, Amos Storkey, Stefano V. Albrecht

Abstract

Successful deployment of multi-agent reinforcement learning often requires agents to adapt their behaviour. In this work, we discuss the problem of teamwork adaptation, in which a team of agents needs to adapt their policies to solve novel tasks with limited fine-tuning. Motivated by the intuition that agents need to be able to identify and distinguish tasks in order to adapt their behaviour to the current task, we propose to learn multi-agent task embeddings (MATE). These task embeddings are trained using an encoder-decoder architecture optimised for reconstruction of the transition and reward functions, which uniquely identify tasks. We show that a team of agents is able to adapt to novel tasks when provided with task embeddings. We propose three MATE training paradigms: independent MATE, centralised MATE, and mixed MATE, which vary in the information used for the task encoding. We show that the embeddings learned by MATE identify tasks and provide useful information which agents leverage during adaptation to novel tasks.
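
The abstract describes an encoder-decoder trained to reconstruct transitions and rewards. The idea can be illustrated with a minimal sketch, which is not the authors' implementation: the linear encoder and decoder, the mean pooling over transitions, the dimensions, and all names (`encode`, `reconstruction_loss`, `W_enc`, `W_dec`) are assumptions made for illustration, and random data stands in for transitions collected in a task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, ACT_DIM, EMB_DIM = 4, 2, 3
TRANS_DIM = OBS_DIM + ACT_DIM + 1 + OBS_DIM  # (obs, action, reward, next_obs)

# Hypothetical linear encoder and decoder weights (untrained).
W_enc = rng.normal(scale=0.1, size=(TRANS_DIM, EMB_DIM))
W_dec = rng.normal(scale=0.1, size=(OBS_DIM + ACT_DIM + EMB_DIM, OBS_DIM + 1))


def encode(transitions):
    """Map a batch of transitions from one task to a single task embedding."""
    return np.tanh(transitions @ W_enc).mean(axis=0)


def reconstruction_loss(obs, act, rew, next_obs):
    """Return the embedding and the MSE of decoded next observations and rewards."""
    transitions = np.concatenate([obs, act, rew[:, None], next_obs], axis=1)
    z = encode(transitions)
    # The decoder sees (obs, action, embedding) and predicts (next_obs, reward).
    z_tiled = np.tile(z, (obs.shape[0], 1))
    pred = np.concatenate([obs, act, z_tiled], axis=1) @ W_dec
    target = np.concatenate([next_obs, rew[:, None]], axis=1)
    return z, float(np.mean((pred - target) ** 2))


# Random data standing in for transitions collected in one task.
T = 16
obs = rng.normal(size=(T, OBS_DIM))
act = rng.normal(size=(T, ACT_DIM))
rew = rng.normal(size=T)
next_obs = rng.normal(size=(T, OBS_DIM))

z, loss = reconstruction_loss(obs, act, rew, next_obs)
```

In a trained version of this sketch, minimising `loss` over the encoder and decoder weights would push `z` to carry whatever task-specific information is needed to predict the next observation and reward, which is the intuition the abstract gives for why such embeddings can identify tasks.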
