Paper title
Sparse Black-box Video Attack with Reinforcement Learning
Paper authors
Paper abstract
Adversarial attacks on video recognition models have been explored recently. However, most existing works treat every video frame equally and ignore the temporal interactions between frames. To overcome this drawback, a few methods select some key frames and then perform attacks based on them. Unfortunately, their selection strategy is independent of the attacking step, so the resulting performance is limited. Instead, we argue that the frame selection phase is closely related to the attacking phase: the key frames should be adjusted according to the attacking results. To this end, we formulate black-box video attacks as a Reinforcement Learning (RL) framework. Specifically, the environment in RL is the recognition model, and the agent in RL performs frame selection. By continuously querying the recognition model and receiving the attacking feedback, the agent gradually adjusts its frame selection strategy, and the adversarial perturbations become smaller and smaller. We conduct a series of experiments with two mainstream video recognition models, C3D and LRCN, on the public UCF-101 and HMDB-51 datasets. The results demonstrate that the proposed method can significantly reduce the adversarial perturbations with an efficient number of queries.
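The agent/environment loop described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the classifier `toy_model`, the function `select_frames`, the Bernoulli per-frame policy, the REINFORCE-style update, the sparsity penalty, and the fixed caller-supplied perturbation are all assumptions made for illustration. The only property taken from the abstract is the overall structure: the recognition model is queried as a black box, and the frame-selection policy is adjusted from the attack feedback.

```python
import numpy as np


def toy_model(video):
    """Hypothetical black-box classifier that returns only a label (0 or 1)."""
    # Later frames get larger weights, so some frames matter more than others.
    weights = np.linspace(0.1, 1.0, video.shape[0])
    score = float(np.dot(weights, video.mean(axis=(1, 2))))
    return int(score > 0.0)


def select_frames(video, perturbation, model, steps=300, lr=0.5,
                  sparsity=0.05, seed=0):
    """Learn which frames to perturb using label-only feedback (assumed setup).

    video, perturbation: float arrays of shape (frames, height, width).
    Returns the smallest successful frame mask found and the query count.
    """
    rng = np.random.default_rng(seed)
    n_frames = video.shape[0]
    true_label = model(video)          # one query for the clean prediction
    queries = 1
    logits = np.zeros(n_frames)        # policy: one Bernoulli logit per frame
    baseline = 0.0                     # running mean reward (variance reduction)
    best_mask = None
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-logits))
        mask = rng.random(n_frames) < probs      # agent action: pick frames
        adv = video.copy()
        adv[mask] += perturbation[mask]          # perturb selected frames only
        fooled = model(adv) != true_label        # environment feedback
        queries += 1
        # Reward a successful attack, penalize every perturbed frame.
        reward = float(fooled) - sparsity * mask.sum()
        # REINFORCE: d log p(mask) / d logits = mask - probs.
        logits += lr * (reward - baseline) * (mask.astype(float) - probs)
        baseline = 0.9 * baseline + 0.1 * reward
        if fooled and (best_mask is None or mask.sum() < best_mask.sum()):
            best_mask = mask.copy()
    return best_mask, queries


if __name__ == "__main__":
    video = np.full((8, 4, 4), 0.1)
    perturbation = np.full_like(video, -0.3)
    mask, queries = select_frames(video, perturbation, toy_model)
    print("frames perturbed:", np.flatnonzero(mask), "queries:", queries)
```

In this sketch the perturbation itself is fixed and only the choice of frames is learned. The abstract describes the perturbations also shrinking over time, which this toy example does not model.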