Paper Title

Logarithmic Regret in Adaptive Control of Noisy Linear Quadratic Regulator Systems Using Hints

Paper Authors

Mohammad Akbari, Bahman Gharesifard, Tamas Linder

Abstract

The problem of regret minimization for online adaptive control of linear-quadratic systems is studied. In this problem, the true system transition parameters (matrices $A$ and $B$) are unknown, and the objective is to design and analyze algorithms that generate control policies with sublinear regret. Recent studies show that when the system parameters are fully unknown, there exists a choice of these parameters such that any algorithm that only uses data from the past system trajectory at best achieves a square root of time horizon regret bound, providing a hard fundamental limit on the achievable regret in general. However, it is also known that (poly)-logarithmic regret is achievable when only matrix $A$ or only matrix $B$ is unknown. We present a result, encompassing both scenarios, showing that (poly)-logarithmic regret is achievable when both of these matrices are unknown, but a hint is periodically given to the controller.
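
For reference, the setting described above is commonly formalized as follows. This is a sketch of the standard LQR regret formulation rather than something stated in the abstract: the cost matrices $Q$ and $R$, the process noise $w_t$, the optimal average cost $J^*$, and the regret symbol $\mathcal{R}_T$ are notation assumed here for illustration, and the paper's exact definitions may differ.

$$
x_{t+1} = A x_t + B u_t + w_t, \qquad c_t = x_t^\top Q x_t + u_t^\top R u_t, \qquad \mathcal{R}_T = \sum_{t=1}^{T} c_t - T J^*
$$

In this notation, the lower bound for fully unknown $A$ and $B$ reads $\mathcal{R}_T = \Omega(\sqrt{T})$, whereas the hint-based result targets $\mathcal{R}_T = O(\mathrm{polylog}(T))$.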