Title
Transferable Adversarial Examples with Bayes Approach
Authors
Abstract
The vulnerability of deep neural networks (DNNs) to black-box adversarial attacks is one of the most heated topics in trustworthy AI. In such attacks, the attacker operates without any insider knowledge of the model, making the cross-model transferability of adversarial examples critical. Despite the potential for adversarial examples to be effective across various models, it has been observed that adversarial examples crafted for a specific model often exhibit poor transferability. In this paper, we explore the transferability of adversarial examples through the lens of a Bayesian approach. Specifically, we leverage the Bayesian approach to probe transferability and then study what constitutes a transferability-promoting prior. Following this, we design two concrete transferability-promoting priors, along with an adaptive dynamic weighting strategy for instances sampled from these priors. Employing these techniques, we present BayAtk. Extensive experiments illustrate the significant effectiveness of BayAtk in crafting more transferable adversarial examples against both undefended and defended black-box models compared to existing state-of-the-art attacks.
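To make the core idea concrete, the following is a minimal, illustrative sketch (not the paper's actual BayAtk algorithm): instead of computing the attack gradient on a single fixed surrogate model, the gradient is averaged over several surrogates whose weights are drawn from a distribution around the trained weights, which is the Bayesian intuition behind transferability. The toy linear classifier, the Gaussian weight perturbation, and all names here (`bayesian_fgsm`, `sigma`, etc.) are assumptions introduced purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy surrogate: a linear classifier p(y=1|x) = sigmoid(w.x + b).
# These "trained" weights stand in for a single white-box surrogate model.
w_map, b_map = np.array([1.5, -2.0]), 0.3

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_loss_wrt_x(x, y, w, b):
    # Gradient of binary cross-entropy w.r.t. the input x: (p - y) * w.
    p = sigmoid(w @ x + b)
    return (p - y) * w

def bayesian_fgsm(x, y, eps=0.25, n_models=20, sigma=0.1):
    """One FGSM-style step using gradients averaged over weight samples
    drawn from an isotropic Gaussian around the surrogate's weights --
    a crude stand-in for sampling models from a posterior/prior."""
    g = np.zeros_like(x)
    for _ in range(n_models):
        w = w_map + sigma * rng.standard_normal(w_map.shape)
        b = b_map + sigma * rng.standard_normal()
        g += grad_loss_wrt_x(x, y, w, b)
    g /= n_models
    # Ascend the averaged loss gradient: an untargeted perturbation that
    # should fool many nearby models, not just the single surrogate.
    return x + eps * np.sign(g)

x = np.array([0.2, 0.4])
x_adv = bayesian_fgsm(x, y=1)
```

The design choice mirrored here is that a perturbation effective against an ensemble of plausible models tends to transfer better to an unseen black-box target than one overfit to a single surrogate; BayAtk's contribution lies in which distributions (priors) the models are drawn from and how the sampled instances are weighted.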