Neural Certificates for Safe Control Policies

Paper Title

Neural Certificates for Safe Control Policies

Authors

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou

Abstract

This paper develops an approach for learning a policy for a dynamical system that is provably both safe and goal-reaching. Here, safety means that the policy must not drive the state of the system into any unsafe region, while goal-reaching requires that the trajectory of the controlled system asymptotically converge to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees safety and a Lyapunov-like function, developed here, that fulfills the goal-reaching requirement. Both functions are represented by neural networks. We show the effectiveness of the method in learning policies that are both safe and goal-reaching on various systems, including pendulums, cart-poles, and UAVs.
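To make the two certificate roles concrete, the following is a minimal sketch on a toy one-dimensional system. It is not the paper's method: the certificates here are hand-picked quadratics rather than neural networks, the conditions are generic textbook-style discrete-time barrier and Lyapunov conditions checked on sampled states, and every name and constant (`DT`, `GOAL_RADIUS`, `step`, `barrier`, `lyapunov`, `check_certificates`, the region bounds) is a hypothetical choice made for illustration.

```python
# Toy illustration only; all names and constants are hypothetical assumptions.
DT = 0.1            # assumed discretization step
INIT_BOUND = 1.0    # assumed initial set: |x| <= 1
UNSAFE_BOUND = 2.0  # assumed unsafe region: |x| >= 2
GOAL_RADIUS = 0.05  # assumed goal region: |x| <= 0.05


def step(x, policy):
    """One step of the toy dynamics x_next = x + DT * u, with u = policy(x)."""
    return x + DT * policy(x)


def barrier(x):
    """Hand-picked barrier candidate: <= 0 on the initial set, > 0 on the unsafe set."""
    return x * x - 1.5 ** 2


def lyapunov(x):
    """Hand-picked Lyapunov-like candidate measuring squared distance to the goal center."""
    return x * x


def check_certificates(policy, samples, tol=1e-12):
    """Return a list of (condition, state) pairs where a sampled condition fails."""
    violations = []
    for x in samples:
        x_next = step(x, policy)
        # Barrier must be non-positive on the initial set.
        if abs(x) <= INIT_BOUND and barrier(x) > 0:
            violations.append(("init", x))
        # Barrier must be positive on the unsafe set.
        if abs(x) >= UNSAFE_BOUND and barrier(x) <= 0:
            violations.append(("unsafe", x))
        # Barrier must not increase along the closed loop inside {barrier <= 0}.
        if barrier(x) <= 0 and barrier(x_next) > barrier(x) + tol:
            violations.append(("barrier_decrease", x))
        # Lyapunov-like value must strictly decrease outside the goal region.
        if abs(x) > GOAL_RADIUS and lyapunov(x_next) >= lyapunov(x):
            violations.append(("lyapunov_decrease", x))
    return violations


def stabilizing_policy(x):
    """Pulls the state toward the origin."""
    return -x


def destabilizing_policy(x):
    """Pushes the state away from the origin, toward the unsafe region."""
    return x


samples = [i / 100.0 for i in range(-300, 301)]
good_violations = check_certificates(stabilizing_policy, samples)
bad_violations = check_certificates(destabilizing_policy, samples)
```

In this sketch the stabilizing policy passes all four sampled checks, while the destabilizing one fails the decrease conditions. Passing on a finite set of samples is not a proof of safety; a learning-based approach would additionally need to train the certificates and the policy jointly and justify that the conditions hold over the whole state space.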
