Measurement-Feedback Control with Optimal Data-Dependent Regret

Paper Title

Measurement-Feedback Control with Optimal Data-Dependent Regret

Authors

Gautam Goel, Babak Hassibi

Abstract

Inspired by online learning, data-dependent regret has recently been proposed as a criterion for controller design. In the regret-optimal control paradigm, causal controllers are designed to minimize regret against a hypothetical optimal noncausal controller, which selects the globally cost-minimizing sequence of control actions given noncausal access to the disturbance sequence. We extend regret-optimal control to the more challenging measurement-feedback setting, where the online controller must compete against the optimal noncausal controller without directly observing the state or the driving disturbance. We show that no measurement-feedback controller can have a bounded competitive ratio, or regret that is bounded by the pathlength of the measurement disturbance. We do derive, however, a controller whose regret has optimal dependence on the joint energy of the driving and measurement disturbances, and another controller whose regret has optimal dependence on the pathlength of the driving disturbance and the energy of the measurement disturbance. The key technique we introduce is a reduction from regret-optimal measurement-feedback control to $H_{\infty}$-optimal measurement-feedback control in a synthetic system. We present numerical simulations that illustrate the efficacy of our proposed control algorithms.
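
As a rough guide to the quantities named in the abstract, the sketch below writes them out under assumed notation. The symbols $x_t$, $u_t$, $y_t$, $w_t$, $v_t$, the system matrices, the constant $\gamma$, and the squared-difference form of the pathlength are illustrative assumptions, not taken from the abstract; the paper's own definitions may differ.

```latex
% Assumed linear system: w_t is the driving disturbance, v_t the measurement disturbance.
% The controller sees only the measurements y_t, not the state x_t or w_t.
x_{t+1} = A x_t + B_u u_t + B_w w_t, \qquad y_t = C x_t + v_t

% Regret of a causal measurement-feedback controller \pi against the optimal
% noncausal controller, which is assumed to know the whole sequence w in advance.
\mathrm{Regret}_{\pi}(w, v) = \mathrm{cost}_{\pi}(w, v) - \mathrm{cost}_{\mathrm{nc}}(w)

% Joint energy of the driving and measurement disturbances (assumed form).
E(w, v) = \sum_{t} \left( \|w_t\|^2 + \|v_t\|^2 \right)

% Pathlength of the driving disturbance (one common convention, assumed here).
P(w) = \sum_{t} \|w_t - w_{t-1}\|^2

% Two data-dependent regret bounds of the kind the abstract describes:
\mathrm{Regret}_{\pi}(w, v) \le \gamma^2 \, E(w, v)
\mathrm{Regret}_{\pi}(w, v) \le \gamma^2 \left( P(w) + \sum_{t} \|v_t\|^2 \right)
```

Under this reading, "optimal dependence" would mean achieving the smallest constant $\gamma$ for which such a bound holds over all disturbance sequences. A bound of the form regret $\le \gamma^2 \cdot$ energy resembles a gain constraint, which may be why a reduction to $H_{\infty}$-optimal control is a natural fit.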
