Paper Title
Online Optimal Control with Affine Constraints
Paper Authors
Paper Abstract
This paper considers online optimal control with affine constraints on the states and actions under linear dynamics with bounded random disturbances. The system dynamics and constraints are assumed to be known and time-invariant, but the convex stage cost functions change adversarially. To solve this problem, we propose Online Gradient Descent with Buffer Zones (OGD-BZ). Theoretically, we show that OGD-BZ with proper parameters guarantees that the system satisfies all the constraints despite any admissible disturbances. Further, we investigate the policy regret of OGD-BZ, which compares OGD-BZ's performance with that of the optimal linear policy in hindsight. We show that, under proper algorithm parameters, OGD-BZ achieves a policy regret upper bound that scales as the square root of the horizon length multiplied by logarithmic terms in the horizon length.
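The buffer-zone idea can be illustrated with a toy sketch. The abstract does not say how OGD-BZ parameterizes its policies or sizes its buffer zones, so the code below is not the paper's algorithm. It only shows the general pattern of projected online gradient descent over a constraint set shrunk by a safety margin. It assumes a scalar decision variable, a symmetric constraint |x| <= bound, and a hand-picked margin; the names `ogd_with_buffer`, `bound`, `buffer`, and `step` are hypothetical.

```python
from typing import Callable, List, Sequence


def project_interval(x: float, lo: float, hi: float) -> float:
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(x, lo), hi)


def ogd_with_buffer(
    grads: Sequence[Callable[[float], float]],
    bound: float,
    buffer: float,
    step: float,
    x0: float = 0.0,
) -> List[float]:
    """Toy projected online gradient descent on a tightened interval.

    Assumption (not from the paper): the original constraint is
    |x| <= bound, and the buffer shrinks it to |x| <= bound - buffer
    so that small unmodeled perturbations cannot cause a violation.
    """
    if not 0.0 <= buffer < bound:
        raise ValueError("buffer must satisfy 0 <= buffer < bound")
    lo, hi = -bound + buffer, bound - buffer
    x = project_interval(x0, lo, hi)
    played: List[float] = []
    for grad in grads:
        # Commit to x before the round's cost gradient is used.
        played.append(x)
        # Gradient step, then projection back onto the tightened set.
        x = project_interval(x - step * grad(x), lo, hi)
    return played


if __name__ == "__main__":
    # Each round's cost is (x - 2)^2, whose minimizer lies outside the set.
    cost_grads = [lambda x: 2.0 * (x - 2.0)] * 5
    print(ogd_with_buffer(cost_grads, bound=1.0, buffer=0.1, step=0.5))
```

In this toy setting the iterates move toward the edge of the tightened interval and stay there, leaving the margin unused. The trade-off between a larger margin (more safety) and the extra cost it incurs is the kind of tension that the regret analysis in the abstract addresses.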