Generating Adversarial Disturbances for Controller Verification

Paper title

Generating Adversarial Disturbances for Controller Verification

Authors

Udaya Ghai, David Snyder, Anirudha Majumdar, Elad Hazan

Abstract

We consider the problem of generating maximally adversarial disturbances for a given controller assuming only blackbox access to it. We propose an online learning approach to this problem that \emph{adaptively} generates disturbances based on control inputs chosen by the controller. The goal of the disturbance generator is to minimize \emph{regret} versus a benchmark disturbance-generating policy class, i.e., to maximize the cost incurred by the controller as well as possible compared to the best possible disturbance generator \emph{in hindsight} (chosen from a benchmark policy class). In the setting where the dynamics are linear and the costs are quadratic, we formulate our problem as an online trust region (OTR) problem with memory and present a new online learning algorithm (\emph{MOTR}) for this problem. We prove that this method competes with the best disturbance generator in hindsight (chosen from a rich class of benchmark policies that includes linear-dynamical disturbance generating policies). We demonstrate our approach on two simulated examples: (i) synthetically generated linear systems, and (ii) generating wind disturbances for the popular PX4 controller in the AirSim simulator. On these examples, we demonstrate that our approach outperforms several baseline approaches, including $H_{\infty}$ disturbance generation and gradient-based methods.
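
For concreteness, the linear-quadratic setting described in the abstract can be written out as follows. This is only a sketch in assumed notation: the symbols $A$, $B$, $Q$, $R$, $\Pi$, and the unit norm bound on $w_t$ do not appear in the abstract and may differ from the paper's own formulation.

$$x_{t+1} = A x_t + B u_t + w_t, \qquad \|w_t\| \le 1, \qquad c_t(x_t, u_t) = x_t^\top Q x_t + u_t^\top R u_t$$

$$\mathrm{Regret}_T = \max_{\pi \in \Pi} \sum_{t=1}^{T} c_t(x_t^{\pi}, u_t^{\pi}) - \sum_{t=1}^{T} c_t(x_t, u_t)$$

Here $u_t$ is chosen by the black-box controller and $w_t$ by the disturbance generator, while $(x_t^{\pi}, u_t^{\pi})$ denotes the trajectory that would result if a benchmark policy $\pi \in \Pi$ had generated the disturbances instead. Under this reading, low regret means the online generator inflicts nearly as much cost on the controller as the best benchmark policy chosen in hindsight.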
