Paper Title
Neural Certificates for Safe Control Policies
Paper Authors
Paper Abstract
This paper develops an approach for learning a policy for a dynamical system that is provably both safe and goal-reaching. Here, safety means that the policy must not drive the state of the system into any unsafe region, while goal-reaching requires that the trajectory of the controlled system converge asymptotically to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees safety, and a Lyapunov-like function developed to fulfill the goal-reaching requirement. Both certificates are represented by neural networks. We demonstrate the effectiveness of the method at learning policies that are both safe and goal-reaching on various systems, including pendulums, cart-poles, and UAVs.
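To make the role of the two certificates concrete, the following is a minimal sketch of how violations of barrier and Lyapunov-like conditions could be scored on sampled states. Everything in it is an assumption made for illustration: the one-dimensional dynamics, the linear policy, the hand-picked quadratic certificates, the region boundaries, and the textbook-style conditions. The abstract does not state the paper's exact conditions or loss, and in the paper the policy and both certificates are neural networks trained jointly rather than fixed functions.

```python
import random

# All functions and constants below are hypothetical, chosen only for illustration.

def policy(x, gain):
    # Linear feedback policy u = -gain * x (stands in for a learned policy).
    return -gain * x

def dynamics(x, u):
    # Toy single-integrator dynamics dx/dt = u.
    return u

def barrier(x):
    # Candidate barrier: B <= 0 on the initial set |x| <= 1,
    # B > 0 on the unsafe set |x| >= 2.
    return x * x - 2.25

def lyapunov_like(x):
    # Candidate Lyapunov-like function for the goal region |x| <= 0.1.
    return x * x

def time_derivative(fn, x, gain, eps=1e-5):
    # d/dt fn(x(t)) = fn'(x) * f(x, policy(x)); fn' via central differences.
    grad = (fn(x + eps) - fn(x - eps)) / (2.0 * eps)
    return grad * dynamics(x, policy(x, gain))

def hinge(value):
    # Penalize only positive values, i.e. violated conditions.
    return max(0.0, value)

def certificate_loss(gain, n_samples=2000, margin=1e-3, seed=0):
    # Average hinge penalty over uniformly sampled states in [-3, 3].
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(-3.0, 3.0)
        if abs(x) <= 1.0:
            # Initial states must satisfy B(x) <= 0.
            total += hinge(barrier(x))
        if abs(x) >= 2.0:
            # Unsafe states must satisfy B(x) > 0.
            total += hinge(margin - barrier(x))
        if abs(barrier(x)) < 0.1:
            # Near the level set B(x) = 0, B must decrease along trajectories.
            total += hinge(time_derivative(barrier, x, gain) + margin)
        if 0.1 < abs(x) < 2.0:
            # Outside the goal region (and inside the safe set), V must decrease.
            total += hinge(time_derivative(lyapunov_like, x, gain) + margin)
    return total / n_samples

if __name__ == "__main__":
    print("stabilizing policy loss:", certificate_loss(gain=1.0))
    print("destabilizing policy loss:", certificate_loss(gain=-1.0))
```

In this sketch a loss of zero on the samples only indicates that no sampled state violates the conditions; it is not a proof. In a learning setting, a loss of this kind would be minimized over the parameters of the policy and the certificates, and a separate verification step would be needed to claim a formal guarantee.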