Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

Paper Title

Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

Authors

Karabag, Mustafa O., Neary, Cyrus, Topcu, Ufuk

Abstract

In a cooperative multiagent system, a collection of agents executes a joint policy in order to achieve some common objective. The successful deployment of such systems hinges on the availability of reliable inter-agent communication. However, many sources of potential disruption to communication exist in practice, such as radio interference, hardware failure, and adversarial attacks. In this work, we develop joint policies for cooperative multiagent systems that are robust to potential losses in communication. More specifically, we develop joint policies for cooperative Markov games with reach-avoid objectives. First, we propose an algorithm for the decentralized execution of joint policies during periods of communication loss. Next, we use the total correlation of the state-action process induced by a joint policy as a measure of the intrinsic dependencies between the agents. We then use this measure to lower-bound the performance of a joint policy when communication is lost. Finally, we present an algorithm that maximizes a proxy to this lower bound in order to synthesize minimum-dependency joint policies that are robust to communication loss. Numerical experiments show that the proposed minimum-dependency policies require minimal coordination between the agents while incurring little to no loss in performance; the total correlation value of the synthesized policy is one fifth of the total correlation value of the baseline policy which does not take potential communication losses into account. As a result, the performance of the minimum-dependency policies remains consistently high regardless of whether or not communication is available. By contrast, the performance of the baseline policy decreases by twenty percent when communication is lost.
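The abstract uses total correlation as its measure of dependency between agents. As a minimal illustration of that quantity only, the sketch below computes the total correlation of a small discrete joint distribution as the sum of the marginal entropies minus the joint entropy. It is not the paper's algorithm: the paper applies the measure to the state-action process induced by a joint policy, whereas the function names and the two toy distributions here are hypothetical and chosen only to show the two extremes.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def total_correlation(joint):
    """Total correlation of a joint distribution given as {(x1, ..., xn): prob}.

    Computed as (sum of marginal entropies) - (joint entropy).
    """
    n = len(next(iter(joint)))
    marginals = [dict() for _ in range(n)]
    # Accumulate each variable's marginal distribution from the joint.
    for outcome, p in joint.items():
        for i, x in enumerate(outcome):
            marginals[i][x] = marginals[i].get(x, 0.0) + p
    return sum(entropy(m.values()) for m in marginals) - entropy(joint.values())

# Hypothetical two-agent action distributions (illustrative numbers only).
# Fully coupled: both agents always pick the same action.
coupled = {("left", "left"): 0.5, ("right", "right"): 0.5}
# Independent: each agent picks uniformly, regardless of the other.
independent = {(a, b): 0.25 for a in ("left", "right") for b in ("left", "right")}

print(total_correlation(coupled))      # 1 bit of shared information
print(total_correlation(independent))  # 0 bits: no dependency
```

In this toy setting the independent distribution has zero total correlation, so each agent could sample its own action without hearing from the other. This matches the intuition behind preferring low-dependency joint policies when communication may be lost.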
