Multi-Agent Reinforcement Learning for Visibility-based Persistent Monitoring

Paper Title

Multi-Agent Reinforcement Learning for Visibility-based Persistent Monitoring

Authors

Jingxi Chen, Amrish Baskaran, Zhongshun Zhang, Pratap Tokekar

Abstract

The Visibility-based Persistent Monitoring (VPM) problem seeks to find a set of trajectories (or controllers) for robots to persistently monitor a changing environment. Each robot has a sensor, such as a camera, with a limited field of view that is obstructed by obstacles in the environment. The robots may need to coordinate with each other to ensure that no point in the environment is left unmonitored for long periods of time. We model the problem such that a penalty accrues at every time step in which a point is left unmonitored. However, the dynamics of the penalty are unknown to us. We present a Multi-Agent Reinforcement Learning (MARL) algorithm for the VPM problem. Specifically, we present a Multi-Agent Graph Attention Proximal Policy Optimization (MA-G-PPO) algorithm that takes as input the local observations of all agents combined with a low-resolution global map to learn a policy for each agent. The graph attention allows agents to share their information with others, leading to an effective joint policy. Our main focus is to understand how effective MARL is for the VPM problem. We investigate five research questions with this broader goal. We find that MA-G-PPO is able to learn a better policy than the non-RL baseline in most cases, that its effectiveness depends on agents sharing information with each other, and that the learned policy shows emergent behavior for the agents.
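The penalty model described in the abstract can be illustrated with a small grid-world sketch. This is not the authors' environment: the abstract states that the penalty dynamics are unknown to the learner, so the linear growth rate, the reset-to-zero rule, the circular sensing range, and every name below (`line_of_sight`, `visible_cells`, `step_penalty`, `growth`) are assumptions chosen only to make the idea concrete.

```python
def line_of_sight(grid, a, b):
    """Return True if no obstacle cell lies strictly between cells a and b.

    The segment is sampled at one point per grid step (a coarse approximation).
    """
    (r0, c0), (r1, c1) = a, b
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(1, steps):
        r = round(r0 + (r1 - r0) * i / steps)
        c = round(c0 + (c1 - c0) * i / steps)
        if grid[r][c] == 1:  # 1 marks an obstacle cell
            return False
    return True


def visible_cells(grid, robot, sensing_range):
    """Free cells within sensing_range of the robot that are not occluded."""
    seen = set()
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == 1:
                continue
            dist_sq = (r - robot[0]) ** 2 + (c - robot[1]) ** 2
            if dist_sq <= sensing_range ** 2 and line_of_sight(grid, robot, (r, c)):
                seen.add((r, c))
    return seen


def step_penalty(grid, penalty, robots, sensing_range, growth=1.0):
    """Advance one time step and return the total penalty over free cells.

    Assumed dynamics (hypothetical): a cell seen by any robot resets to 0,
    every other free cell accrues `growth`.
    """
    seen = set()
    for robot in robots:
        seen |= visible_cells(grid, robot, sensing_range)
    total = 0.0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == 1:
                continue
            penalty[r][c] = 0.0 if (r, c) in seen else penalty[r][c] + growth
            total += penalty[r][c]
    return total


# A corridor with one obstacle in the middle; 0 = free, 1 = obstacle.
grid = [[0, 0, 1, 0, 0]]
penalty = [[0.0] * 5]
one_robot = step_penalty(grid, penalty, robots=[(0, 0)], sensing_range=10)
two_robots = step_penalty(grid, penalty, robots=[(0, 0), (0, 4)], sensing_range=10)
```

In this toy corridor a single robot cannot see past the obstacle, so the cells behind it keep accruing penalty until a second robot covers them, which is the kind of coordination the abstract refers to.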
