Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Paper Title

Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Authors

Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine

Abstract

Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs. In order to avoid distribution shift when deploying learning-based control algorithms, we seek a mechanism to constrain the agent to states and actions that resemble those that it was trained on. In control theory, Lyapunov stability and control-invariant sets allow us to make guarantees about controllers that stabilize the system around specific states, while in machine learning, density models allow us to estimate the training data distribution. Can we combine these two concepts, producing learning-based control algorithms that constrain the system to in-distribution states using only in-distribution actions? In this work, we propose to do this by combining concepts from Lyapunov stability and density estimation, introducing Lyapunov density models: a generalization of control Lyapunov functions and density models that provides guarantees on an agent's ability to stay in-distribution over its entire trajectory.
