Paper Title
Efficient Adversarial Input Generation via Neural Net Patching
Authors
Abstract
The generation of adversarial inputs has become a crucial issue in establishing the robustness and trustworthiness of deep neural nets, especially when they are used in safety-critical application domains such as autonomous vehicles and precision medicine. However, the problem poses multiple practical challenges, including scalability issues owing to large-sized networks, and the generation of adversarial inputs that lack important qualities such as naturalness and output-impartiality. This problem shares its end goal with the task of patching neural nets where small changes in some of the network's weights need to be discovered so that upon applying these changes, the modified net produces the desirable output for a given set of inputs. We exploit this connection by proposing to obtain an adversarial input from a patch, with the underlying observation that the effect of changing the weights can also be brought about by changing the inputs instead. Thus, this paper presents a novel way to generate input perturbations that are adversarial for a given network by using an efficient network patching technique. We note that the proposed method is significantly more effective than the prior state-of-the-art techniques.
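The observation that a weight change can be emulated by an input change can be illustrated on a single linear unit. The sketch below is not the paper's algorithm; it only assumes a unit computing `w · x`, a hypothetical patch `dw` on its weights, and asks for the smallest input perturbation `delta` such that `w · (x + delta)` equals `(w + dw) · x`. The function name `patch_to_input_perturbation` and all numeric values are made up for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def patch_to_input_perturbation(w, dw, x):
    """Return the minimum-norm delta with w.(x + delta) == (w + dw).x.

    Assumption: a single linear unit without bias or activation.
    The patch shifts the output by dw.x, so delta must satisfy
    w.delta == dw.x; the smallest such delta is parallel to w.
    """
    target_shift = dot(dw, x)  # output change caused by the weight patch
    norm_sq = dot(w, w)
    if norm_sq == 0.0:
        raise ValueError("all-zero weights: no input change can shift the output")
    scale = target_shift / norm_sq
    return [scale * wi for wi in w]


# Illustrative values only.
w = [1.0, -2.0, 0.5]    # original weights
dw = [0.1, 0.0, -0.2]   # hypothetical patch found by some patching tool
x = [2.0, 1.0, 4.0]     # original input

delta = patch_to_input_perturbation(w, dw, x)

patched_output = dot([wi + dwi for wi, dwi in zip(w, dw)], x)
perturbed_output = dot(w, [xi + di for xi, di in zip(x, delta)])

print(patched_output, perturbed_output)
```

In a real network the first layer has many units and is followed by nonlinearities, so an exact match may not exist and the perturbation would have to be found approximately; the single-unit case is used here only because the equivalence is exact and easy to check.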