Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory

Paper Title

Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory

Authors

Lenart Treven, Sebastian Curi, Mojmir Mutny, Andreas Krause

Abstract

The principal task in controlling a dynamical system is to ensure its stability. When the system is unknown, robust approaches are promising because they aim to stabilize a large set of plausible systems simultaneously. We study linear controllers under a quadratic cost model, a setting also known as the linear quadratic regulator (LQR). We present two different semidefinite programs (SDPs), each of which yields a controller that stabilizes all systems within an ellipsoidal uncertainty set. We further show that the feasibility conditions of the proposed SDPs are equivalent. Using the derived robust controller syntheses, we propose an efficient data-dependent algorithm, eXploration, that with high probability quickly identifies a stabilizing controller. Our approach can be used to initialize existing algorithms that require a stabilizing controller as input, while adding a constant to the regret. We further propose several heuristics that empirically reduce the number of steps taken by eXploration and reduce the cost suffered while searching for a stabilizing controller.
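To make the notion of a "stabilizing controller" concrete, here is a minimal sketch that checks whether a given linear controller stabilizes a known linear system. It is only an illustration under stated assumptions, not the paper's algorithm. The abstract does not give the dynamics, so the sketch assumes a discrete-time model x_{t+1} = A x_t + B u_t with state feedback u_t = K x_t. The matrices `A`, `B`, `K` and the helper names `spectral_radius` and `is_stabilizing` are hypothetical and chosen for illustration.

```python
import numpy as np


def spectral_radius(M: np.ndarray) -> float:
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def is_stabilizing(A: np.ndarray, B: np.ndarray, K: np.ndarray) -> bool:
    """Check stability of the closed loop A + B K.

    Assumption (not stated in the abstract): discrete-time dynamics
    x_{t+1} = A x_t + B u_t with u_t = K x_t, so the closed loop is
    stable exactly when the spectral radius of A + B K is below 1.
    """
    return spectral_radius(A + B @ K) < 1.0


# Hypothetical example system: the open loop has an eigenvalue at 1.1,
# so it is unstable without control.
A = np.array([[1.1, 0.5],
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])

# Hypothetical feedback gain, picked by hand for this example.
K = np.array([[-0.4, -1.0]])

open_loop_stable = is_stabilizing(A, B, np.zeros((1, 2)))
closed_loop_stable = is_stabilizing(A, B, K)
print("open loop stable:", open_loop_stable)
print("closed loop stable:", closed_loop_stable)
```

This check requires knowing `A` and `B` exactly. The setting described in the abstract is harder: the system is unknown, so the controller must pass this kind of test for every system in an uncertainty set at once.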
