Event-Triggered Multi-agent Reinforcement Learning with Communication under Limited-bandwidth Constraint

Paper Title

Event-Triggered Multi-agent Reinforcement Learning with Communication under Limited-bandwidth Constraint

Authors

Hu, Guangzheng; Zhu, Yuanheng; Zhao, Dongbin; Zhao, Mengchen; Hao, Jianye

Abstract

Communicating in a distributed manner and behaving as a group are essential in multi-agent reinforcement learning. However, real-world multi-agent systems are constrained by limited communication bandwidth. If the bandwidth is fully occupied, some agents cannot send messages to others promptly, which delays decisions and impairs cooperation. Recent related work has started to address the problem but still fails to minimize the consumption of communication resources. In this paper, we propose the Event-Triggered Communication Network (ETCNet), which improves communication efficiency in multi-agent systems by sending messages only when necessary. Using information theory, the limited bandwidth is translated into the penalty threshold of an event-triggered strategy, which determines at each step whether an agent sends a message. The design of the event-triggered strategy is then formulated as a constrained Markov decision problem, and reinforcement learning finds the best communication protocol that satisfies the limited-bandwidth constraint. Experiments on typical multi-agent tasks demonstrate that ETCNet outperforms other methods in reducing bandwidth occupancy while preserving the cooperative performance of multi-agent systems to the greatest extent.
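The idea of sending a message only when necessary can be illustrated with a small sketch. This is not the ETCNet method: the abstract describes a gate learned by reinforcement learning under a constrained Markov decision formulation, whereas the sketch below assumes a fixed, hand-set threshold on how far the current message has drifted from the last transmitted one. The class name `EventTriggeredSender`, the `threshold` parameter, and the Euclidean drift measure are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class EventTriggeredSender:
    """Hypothetical fixed-threshold gate (an assumption, not the learned ETCNet policy)."""

    threshold: float                         # assumed hand-set trigger threshold
    last_sent: Optional[List[float]] = None  # last message actually transmitted
    sent_count: int = 0
    step_count: int = 0

    def step(self, message: Sequence[float]) -> bool:
        """Return True if the message is transmitted at this step."""
        self.step_count += 1
        if self.last_sent is None:
            # Nothing has been sent yet, so the first message always goes out.
            trigger = True
        else:
            # Euclidean distance between the new message and the last sent one.
            drift = sum((a - b) ** 2 for a, b in zip(message, self.last_sent)) ** 0.5
            trigger = drift > self.threshold
        if trigger:
            self.last_sent = list(message)
            self.sent_count += 1
        return trigger

    @property
    def occupancy(self) -> float:
        """Fraction of steps in which a message was transmitted."""
        return self.sent_count / self.step_count if self.step_count else 0.0


# Usage: a short stream of 2-D messages with one large jump at the fourth step.
sender = EventTriggeredSender(threshold=0.5)
stream = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [1.0, 0.0], [1.1, 0.0]]
decisions = [sender.step(m) for m in stream]
print(decisions, sender.occupancy)
```

In this toy stream only the first message and the large jump are transmitted, so the channel is used in two of five steps. Receivers would reuse the last received message in the remaining steps; a learned gate would replace the fixed threshold rule.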
