Paper title
Implications of Regret on Stability of Linear Dynamical Systems
Paper authors
Paper abstract
The setting of an agent making decisions under uncertainty and under dynamic constraints is common in optimal control, reinforcement learning, and, more recently, online learning. In the online learning setting, the quality of an agent's decisions is often quantified by regret, which compares the performance of the chosen decisions to that of the best possible decisions in hindsight. While regret is a useful performance measure, when dynamical systems are concerned it is also important to assess the stability of the closed-loop system under the chosen policy. In this work, we show that for linear state feedback policies and linear systems subject to adversarial disturbances, linear regret implies asymptotic stability in both time-varying and time-invariant settings. Conversely, we also show that bounded-input bounded-state stability and summability of the state transition matrices imply linear regret.
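As a rough numerical illustration of the link between regret and stability described in the abstract, the sketch below simulates a scalar linear system under a fixed linear state feedback gain and compares its cumulative cost with the best gain in hindsight from a small grid. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar dynamics `x_next = a*x + b*u + w`, the quadratic stage cost `x**2 + u**2`, the bounded disturbance `cos(t)`, the comparator grid, and all function names.

```python
import math


def total_cost(a, b, k, disturbances):
    """Cumulative quadratic cost of the policy u = -k * x, starting from x = 0.

    Assumed dynamics: x_next = a * x + b * u + w.
    Assumed stage cost: x**2 + u**2.
    """
    x = 0.0
    cost = 0.0
    for w in disturbances:
        u = -k * x
        cost += x * x + u * u
        x = a * x + b * u + w
    return cost


def regret(a, b, k, comparator_gains, disturbances):
    """Cost of gain k minus the cost of the best comparator gain in hindsight."""
    best = min(total_cost(a, b, kc, disturbances) for kc in comparator_gains)
    return total_cost(a, b, k, disturbances) - best


def regret_per_step(a, b, k, comparator_gains, horizon):
    """Regret divided by the horizon, under the bounded disturbance w_t = cos(t)."""
    disturbances = [math.cos(t) for t in range(horizon)]
    return regret(a, b, k, comparator_gains, disturbances) / horizon


# Hypothetical example values: open-loop unstable plant with a = 1.2, b = 1.0.
A, B = 1.2, 1.0
GRID = [0.4 + 0.1 * i for i in range(15)]  # every gain here gives |a - b*k| < 1
STABLE_GAIN = 0.7    # closed loop a - b*k = 0.5
UNSTABLE_GAIN = 0.1  # closed loop a - b*k = 1.1

if __name__ == "__main__":
    for horizon in (50, 100, 200):
        print(
            horizon,
            regret_per_step(A, B, STABLE_GAIN, GRID, horizon),
            regret_per_step(A, B, UNSTABLE_GAIN, GRID, horizon),
        )
```

In this toy setup the per-step regret of the stabilizing gain stays bounded as the horizon grows, while that of the destabilizing gain blows up, which is consistent with the contrapositive of the abstract's claim that linear regret implies asymptotic stability. It is only a sanity check on one scalar example, not a substitute for the paper's argument.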