Neural Lyapunov Redesign

Paper Title

Neural Lyapunov Redesign

Authors

Arash Mehrjou, Mohammad Ghavamzadeh, Bernhard Schölkopf

Abstract

Learning controllers merely based on a performance metric has been proven effective in many physical and non-physical tasks in both control theory and reinforcement learning. However, in practice, the controller must guarantee some notion of safety to ensure that it does not harm either the agent or the environment. Stability is a crucial notion of safety, whose violation can certainly cause unsafe behaviors. Lyapunov functions are effective tools to assess stability in nonlinear dynamical systems. In this paper, we combine an improving Lyapunov function with automatic controller synthesis in an iterative fashion to obtain control policies with large safe regions. We propose a two-player collaborative algorithm that alternates between estimating a Lyapunov function and deriving a controller that gradually enlarges the stability region of the closed-loop system. We provide theoretical results on the class of systems that can be treated with the proposed algorithm and empirically evaluate the effectiveness of our method using an exemplary dynamical system.
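
To make concrete what it means for a Lyapunov function to certify stability, the following is a minimal sketch and not the paper's algorithm. It assumes a damped pendulum as the dynamical system and the total energy as the Lyapunov candidate; the system, the damping value, and all function names are illustrative choices that do not come from the paper. The sketch checks on a sampled grid that the candidate is positive away from the origin and does not increase along trajectories.

```python
import math

# Illustrative assumption: a damped pendulum, not the system used in the paper.
def pendulum_dynamics(theta, omega, damping=0.5):
    """Return (theta', omega') with omega' = -sin(theta) - damping * omega."""
    return omega, -math.sin(theta) - damping * omega

def lyapunov_candidate(theta, omega):
    """Total energy V = (1 - cos(theta)) + omega^2 / 2, which is zero at the origin."""
    return (1.0 - math.cos(theta)) + 0.5 * omega ** 2

def lyapunov_derivative(theta, omega, damping=0.5):
    """Time derivative of V along trajectories: dV/dtheta * theta' + dV/domega * omega'."""
    dtheta, domega = pendulum_dynamics(theta, omega, damping)
    return math.sin(theta) * dtheta + omega * domega

def conditions_hold_on_grid(radius=1.0, steps=21, damping=0.5, tol=1e-12):
    """Check V > 0 away from the origin and V_dot <= 0 on a square grid of states."""
    points = [-radius + 2.0 * radius * i / (steps - 1) for i in range(steps)]
    for theta in points:
        for omega in points:
            at_origin = theta == 0.0 and omega == 0.0
            if not at_origin and lyapunov_candidate(theta, omega) <= 0.0:
                return False
            # tol absorbs floating-point rounding in the derivative
            if lyapunov_derivative(theta, omega, damping) > tol:
                return False
    return True

if __name__ == "__main__":
    print(conditions_hold_on_grid())
```

For this candidate the derivative simplifies analytically to -damping * omega^2, so the grid check only confirms numerically what the algebra already shows. A sampled check of this kind is evidence rather than a proof, and it says nothing about how the paper estimates its Lyapunov function or synthesizes its controller.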
