Paper Title

Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning

Authors

Sun, Jianwen; Zhang, Tianwei; Xie, Xiaofei; Ma, Lei; Zheng, Yan; Chen, Kangjie; Liu, Yang

Abstract

Adversarial attacks against conventional Deep Learning (DL) systems and algorithms have been widely studied, and various defenses have been proposed. However, the possibility and feasibility of such attacks against Deep Reinforcement Learning (DRL) are less explored. As DRL has achieved great success in various complex tasks, designing effective adversarial attacks is an indispensable prerequisite for building robust DRL algorithms. In this paper, we introduce two novel adversarial attack techniques to stealthily and efficiently attack DRL agents. These two techniques enable an adversary to inject adversarial samples at a minimal set of critical moments while causing the most severe damage to the agent. The first technique is the critical point attack: the adversary builds a model to predict the future environmental states and the agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one. The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments for attacking the agent in an episode. Experimental results demonstrate the effectiveness of our techniques. Specifically, to successfully attack the DRL agent, our critical point technique requires only 1 step (TORCS) or 2 steps (Atari Pong and Breakout), and the antagonist technique needs fewer than 5 steps (4 MuJoCo tasks), which are significant improvements over state-of-the-art methods.
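
The search loop behind the critical point attack, as the abstract describes it, can be illustrated with a minimal sketch. This is not the authors' implementation: the lane-keeping toy, the names `rollout` and `critical_point_search`, the `DRIFT` and `LIMIT` constants, and the damage function are all hypothetical, an exact simulator stands in for the learned prediction model, and forcing an action stands in for injecting an adversarial sample that makes the agent take that action.

```python
from itertools import product

DRIFT = [0, 0, 1, 1, 0, 0]  # hypothetical sideways push at each time step
LIMIT = 3                   # hypothetical crash threshold: |position| >= LIMIT


def policy(state):
    """Toy victim agent: steer one unit back toward the lane center (x = 0)."""
    x = state[0]
    return (x < 0) - (x > 0)


def model(state, action):
    """Toy dynamics standing in for the adversary's prediction model."""
    x, t, crashed = state
    if crashed:  # a crash is absorbing
        return state
    nx = x + action + DRIFT[t]
    return (nx, t + 1, abs(nx) >= LIMIT)


def rollout(state, horizon, forced=None):
    """Predict `horizon` steps ahead; `forced` maps step index -> overriding action."""
    forced = forced or {}
    for t in range(horizon):
        action = forced.get(t, policy(state))
        state = model(state, action)
    return state


def critical_point_search(state, actions, horizon, n_attack, damage):
    """Enumerate every strategy (start step, forced actions) and keep the most damaging."""
    best = (None, None, float("-inf"))
    for start in range(horizon - n_attack + 1):
        for seq in product(actions, repeat=n_attack):
            forced = {start + i: a for i, a in enumerate(seq)}
            d = damage(rollout(state, horizon, forced))
            if d > best[2]:
                best = (start, seq, d)
    return best


def damage(state):
    """Hypothetical damage score: 1.0 for a crash, else normalized lane offset."""
    return 1.0 if state[2] else abs(state[0]) / LIMIT


start, actions, dmg = critical_point_search(
    (0, 0, False), actions=(-1, 0, 1), horizon=6, n_attack=1, damage=damage
)
print(start, actions, dmg)  # prints: 3 (1,) 1.0
```

In this toy, a single forced action causes a crash only at step 3, when the drift has already pushed the agent off center; the same action at any other step is corrected by the policy. That mirrors the abstract's point that a few well-timed attack steps can suffice.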
