Paper Title

Transferable Multi-Agent Reinforcement Learning with Dynamic Participating Agents

Authors

Xuting Tang, Jia Xu, Shusen Wang

Abstract

We study multi-agent reinforcement learning (MARL) with centralized training and decentralized execution. During training, new agents may join, and existing agents may unexpectedly leave. In such situations, a standard deep MARL model must be trained again from scratch, which is very time-consuming. To tackle this problem, we propose a special network architecture with a few-shot learning algorithm that allows the number of agents to vary during centralized training. In particular, when a new agent joins the centralized training, our few-shot learning algorithm trains its policy network and value network using a small number of samples; when an agent leaves the training, the training process of the remaining agents is not affected. Our experiments show that using the proposed network architecture and algorithm, model adaptation when new agents join can be 100+ times faster than the baseline. Our work is applicable to any setting, including cooperative, competitive, and mixed.
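To illustrate why a model can tolerate a changing number of agents, here is a minimal sketch (not the paper's actual architecture) of the core idea such designs rely on: a centralized value function built from a shared per-agent encoder followed by permutation-invariant pooling, so that no weight shape depends on the team size. All names (`W_enc`, `w_val`, `value`) and the mean-pooling choice are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, EMB_DIM = 8, 16

# Shared per-agent encoder and a value head. Because the encoder is
# shared across agents and the embeddings are pooled, these weight
# shapes are independent of how many agents are present.
W_enc = rng.normal(scale=0.1, size=(OBS_DIM, EMB_DIM))
w_val = rng.normal(scale=0.1, size=(EMB_DIM,))

def value(observations: np.ndarray) -> float:
    """Centralized value estimate.

    observations: array of shape (n_agents, OBS_DIM);
    works unchanged for any n_agents.
    """
    emb = np.tanh(observations @ W_enc)  # encode each agent separately
    pooled = emb.mean(axis=0)            # permutation-invariant pooling
    return float(pooled @ w_val)

# The same weights evaluate teams of different sizes: an agent joining
# or leaving changes only the number of rows in the input, so no
# retraining from scratch is forced by the architecture itself.
v3 = value(rng.normal(size=(3, OBS_DIM)))  # 3 agents
v5 = value(rng.normal(size=(5, OBS_DIM)))  # 5 agents
```

In this sketch, a newly joining agent would only need its own policy parameters adapted (the few-shot step described in the abstract), since the shared centralized components already accept the larger team.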
