Paper title
Diversity can be Transferred: Output Diversification for White- and Black-box Attacks
Paper authors
Paper abstract
Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks. These simple perturbations, however, could be sub-optimal as they are agnostic to the model being attacked. To improve the efficiency of these attacks, we propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples. While ODS is a gradient-based strategy, the diversity offered by ODS is transferable and can be helpful for both white-box and black-box attacks via surrogate models. Empirically, we demonstrate that ODS significantly improves the performance of existing white-box and black-box attacks. In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
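The abstract does not state the sampling rule itself. As a minimal sketch, assume the ODS direction is the normalized input-gradient of a randomly weighted sum of the model's outputs, v = ∇_x(w_d^T f(x)) / ||∇_x(w_d^T f(x))||_2 with w_d drawn uniformly from [-1, 1]^C, and that a white-box initialization takes a few signed steps along v inside an L_inf ball. The two-layer tanh network, the function names (`make_toy_model`, `ods_direction`, `ods_init`), and the values of `eps`, `step`, and `n_steps` are hypothetical, chosen only so the gradient can be written in closed form with NumPy.

```python
import numpy as np


def make_toy_model(in_dim, hidden, n_classes, seed=0):
    """Hypothetical stand-in for a classifier: logits = W2 @ tanh(W1 @ x + b1) + b2."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(size=(hidden, in_dim)) / np.sqrt(in_dim),
        "b1": np.zeros(hidden),
        "W2": rng.normal(size=(n_classes, hidden)) / np.sqrt(hidden),
        "b2": np.zeros(n_classes),
    }


def logits(params, x):
    """Model output f(x) in logit space."""
    h = np.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ h + params["b2"]


def ods_direction(params, x, w_d):
    """Assumed ODS direction: normalized gradient of w_d^T f(x) with respect to x."""
    h = np.tanh(params["W1"] @ x + params["b1"])
    # Chain rule through the tanh layer (closed form for this toy model only).
    grad = params["W1"].T @ ((1.0 - h ** 2) * (params["W2"].T @ w_d))
    return grad / (np.linalg.norm(grad) + 1e-12)


def ods_init(params, x, w_d, eps, step, n_steps):
    """Assumed white-box use: signed steps along the ODS direction,
    projected back onto the L_inf ball of radius eps around x."""
    x_k = x.copy()
    for _ in range(n_steps):
        v = ods_direction(params, x_k, w_d)
        x_k = x_k + step * np.sign(v)
        x_k = np.clip(x_k, x - eps, x + eps)
    return x_k


if __name__ == "__main__":
    params = make_toy_model(in_dim=8, hidden=16, n_classes=5)
    rng = np.random.default_rng(1)
    x = rng.normal(size=8)
    # One random output-space weighting per restart.
    w_d = rng.uniform(-1.0, 1.0, size=5)
    x0 = ods_init(params, x, w_d, eps=0.3, step=0.1, n_steps=5)
    print("output shift:", logits(params, x0) - logits(params, x))
```

Drawing a fresh `w_d` for each restart pushes the starting points toward different regions of output space, which is the diversity the abstract refers to. In the black-box setting described above, the same direction would be computed on a surrogate model and used as a candidate update direction for the target model.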