Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM

Paper title

Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM

Authors

Pereira, Marcus A., Saravanos, Augustinos D., So, Oswin, Theodorou, Evangelos A.

Abstract

In this work, we propose a novel safe and scalable decentralized solution for multi-agent control in the presence of stochastic disturbances. Safety is mathematically encoded using stochastic control barrier functions, and safe controls are computed by solving quadratic programs. Decentralization is achieved by augmenting each agent's optimization variables with copy variables for its neighbors. This allows us to decouple the centralized multi-agent optimization problem. However, to ensure safety, neighboring agents must agree on "what is safe for both of us", and this creates a need for consensus. To enable safe consensus solutions, we incorporate an ADMM-based approach. Specifically, we propose a Merged CADMM-OSQP implicit neural network layer that solves a mini-batch of both the local quadratic programs and the overall consensus problem as a single optimization problem. This layer is embedded within a Deep FBSDEs network architecture at every time step to facilitate end-to-end differentiable, safe, and decentralized stochastic optimal control. The efficacy of the proposed approach is demonstrated on several challenging multi-robot tasks in simulation. By imposing safety requirements specified by collision avoidance constraints, the safe operation of all agents is ensured during the entire training process. We also demonstrate superior scalability in terms of computational and memory savings compared to a centralized approach.
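The consensus idea in the abstract (each agent keeps its own copy of a shared variable, and ADMM drives the copies to agree) can be illustrated with a minimal sketch of textbook global-consensus ADMM. This is not the paper's Merged CADMM-OSQP layer: the quadratic tracking costs, the simple lower-bound constraint standing in for a safety constraint, and the names `consensus_admm`, `targets`, and `lower_bound` are all assumptions made for illustration.

```python
import numpy as np

def consensus_admm(targets, lower_bound, rho=1.0, iters=200):
    """Textbook global-consensus ADMM (illustrative, not the paper's layer).

    Agent i keeps a local copy x_i of a shared vector z and has the private
    cost 0.5 * ||x_i - targets[i]||^2. The shared constraint z >= lower_bound
    is a hypothetical stand-in for a safety constraint.
    """
    targets = np.asarray(targets, dtype=float)  # shape (N agents, d)
    x = targets.copy()                          # local copies, one row per agent
    y = np.zeros_like(targets)                  # dual variables, one row per agent
    z = np.maximum(targets.mean(axis=0), lower_bound)
    for _ in range(iters):
        # Local step: closed-form minimizer of each agent's augmented Lagrangian.
        x = (targets - y + rho * z) / (1.0 + rho)
        # Consensus step: average the copies, then project onto z >= lower_bound.
        z = np.maximum((x + y / rho).mean(axis=0), lower_bound)
        # Dual step: accumulate the remaining disagreement between x_i and z.
        y = y + rho * (x - z)
    return z, x

# Two agents with different preferred values for a 2-D shared variable.
z, x = consensus_admm(targets=[[0.0, 2.0], [1.0, 4.0]], lower_bound=1.0)
```

In this toy setting the agreed value is the average of the agents' targets projected onto the constraint, so `z` approaches `[1.0, 3.0]` and both rows of `x` approach `z`. The local step uses only agent-specific data, which is the property that lets a consensus formulation be computed in a decentralized way.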
