Paper Title
Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory
Paper Authors
Paper Abstract
The principal task in controlling a dynamical system is to ensure its stability. When the system is unknown, robust approaches are promising because they aim to stabilize a large set of plausible systems simultaneously. We study linear controllers under a quadratic cost model, a setting also known as the linear quadratic regulator (LQR). We present two different semidefinite programs (SDPs), each of which yields a controller that stabilizes all systems within an ellipsoidal uncertainty set. We further show that the feasibility conditions of the proposed SDPs are \emph{equivalent}. Using the derived robust controller syntheses, we propose an efficient data-dependent algorithm, \textsc{eXploration}, that with high probability quickly identifies a stabilizing controller. Our approach can be used to initialize existing algorithms that require a stabilizing controller as input, while adding only a constant to the regret. We further propose several heuristics that empirically reduce the number of steps taken by \textsc{eXploration} and reduce the cost suffered while searching for a stabilizing controller.
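To make the notion of a stabilizing controller concrete, the sketch below checks stability numerically. It is not the paper's SDP or the \textsc{eXploration} algorithm. It assumes discrete-time dynamics x_{t+1} = A x_t + B u_t with linear feedback u_t = K x_t, in which case K is stabilizing exactly when the spectral radius of A + B K is below 1. The matrices A, B, K, the finite list of plausible systems, and the helper names are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def stabilizes(A, B, K):
    """True if u_t = K x_t stabilizes x_{t+1} = A x_t + B u_t (assumed discrete time)."""
    return spectral_radius(A + B @ K) < 1.0


def stabilizes_all(systems, K):
    """True if one controller K stabilizes every (A, B) pair in a finite candidate set."""
    return all(stabilizes(A, B, K) for A, B in systems)


# Hypothetical open-loop unstable system: A has an eigenvalue of 1.2.
A = np.array([[1.2, 0.3],
              [0.0, 0.8]])
B = np.eye(2)

# Hypothetical feedback gain that moves the unstable eigenvalue to 0.5.
K = np.array([[-0.7, 0.0],
              [0.0, 0.0]])

# A finite stand-in for an uncertainty set: the nominal system and one perturbation.
plausible = [(A, B), (A + 0.1 * np.eye(2), B)]

print(stabilizes(A, B, np.zeros((2, 2))))  # no control applied
print(stabilizes(A, B, K))                 # nominal system under K
print(stabilizes_all(plausible, K))        # every candidate system under K
```

Checking a finite list of candidates is only a stand-in: a robust synthesis must certify stability for every system in a continuous uncertainty set, which a pointwise eigenvalue check cannot do.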