Robust Learning-Based Control via Bootstrapped Multiplicative Noise

Paper Title

Robust Learning-Based Control via Bootstrapped Multiplicative Noise

Authors

Benjamin Gravell, Tyler Summers

Abstract

Despite decades of research and recent progress in adaptive control and reinforcement learning, there remains a fundamental lack of understanding in designing controllers that provide robustness to inherent non-asymptotic uncertainties arising from models estimated with finite, noisy data. We propose a robust adaptive control algorithm that explicitly incorporates such non-asymptotic uncertainties into the control design. The algorithm has three components: (1) a least-squares nominal model estimator; (2) a bootstrap resampling method that quantifies non-asymptotic variance of the nominal model estimate; and (3) a non-conventional robust control design method using an optimal linear quadratic regulator (LQR) with multiplicative noise. A key advantage of the proposed approach is that the system identification and robust control design procedures both use stochastic uncertainty representations, so that the actual inherent statistical estimation uncertainty directly aligns with the uncertainty the robust controller is being designed against. We show through numerical experiments that the proposed robust adaptive controller can significantly outperform the certainty equivalent controller on both expected regret and measures of regret risk.
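
The three components named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`least_squares`, `bootstrap_noise`, `mn_lqr`) are hypothetical, the bootstrap shown is a plain resampling of observed transitions (the paper's resampling scheme may differ), and the controller is obtained by iterating a generalized Riccati recursion for a model of the form x_{t+1} = (A + Σ γ_i A_i) x_t + (B + Σ δ_j B_j) u_t, where γ_i and δ_j are zero-mean noises with variances `a_var` and `b_var`.

```python
import numpy as np

def least_squares(X, U, Xn):
    """Nominal model: fit Xn ≈ X @ A.T + U @ B.T by ordinary least squares."""
    Z = np.hstack([X, U])                      # regressors, shape (T, n + m)
    Theta, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T            # A_hat (n, n), B_hat (n, m)

def bootstrap_noise(X, U, Xn, n_boot, rng):
    """Assumed scheme: resample transitions with replacement, refit, and
    eigendecompose the second moment of the estimate deviations."""
    T = X.shape[0]
    A_hat, B_hat = least_squares(X, U, Xn)
    dA, dB = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)       # indices drawn with replacement
        A_b, B_b = least_squares(X[idx], U[idx], Xn[idx])
        dA.append((A_b - A_hat).ravel())
        dB.append((B_b - B_hat).ravel())

    def decompose(devs, shape):
        D = np.array(devs)
        second_moment = D.T @ D / len(devs)
        var, vec = np.linalg.eigh(second_moment)
        # Eigenvalues act as noise variances, eigenvectors as noise directions.
        return np.clip(var, 0.0, None), [v.reshape(shape) for v in vec.T]

    a_var, A_dirs = decompose(dA, A_hat.shape)
    b_var, B_dirs = decompose(dB, B_hat.shape)
    return a_var, A_dirs, b_var, B_dirs

def mn_lqr(A, B, Q, R, a_var, A_dirs, b_var, B_dirs, iters=500):
    """Iterate the generalized Riccati recursion for LQR with multiplicative
    noise. The iteration can diverge if the noise variances are too large."""
    P = Q.copy()
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(iters):
        G_A = sum(a * Ai.T @ P @ Ai for a, Ai in zip(a_var, A_dirs))
        G_B = sum(b * Bj.T @ P @ Bj for b, Bj in zip(b_var, B_dirs))
        H = R + B.T @ P @ B + G_B
        K = -np.linalg.solve(H, B.T @ P @ A)   # control law u = K x
        P = Q + A.T @ P @ A + G_A + A.T @ P @ B @ K
    return K, P
```

Setting all variances to zero reduces `mn_lqr` to the standard discrete-time LQR recursion, which corresponds to the certainty equivalent controller the abstract compares against.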
