Paper Title
Cross-domain Cross-architecture Black-box Attacks on Fine-tuned Models with Transferred Evolutionary Strategies

Authors

Yinghua Zhang, Yangqiu Song, Kun Bai, Qiang Yang

Abstract

Fine-tuned models can be vulnerable to adversarial attacks. Existing work on black-box attacks on fine-tuned models (BAFT) is limited by strong assumptions. To fill the gap, we propose two novel BAFT settings, cross-domain and cross-domain cross-architecture BAFT, which only assume that (1) the target model under attack is a fine-tuned model, and (2) the source domain data is known and accessible. To successfully attack fine-tuned models under both settings, we propose to first train an adversarial generator against the source model, which adopts an encoder-decoder architecture and maps a clean input to an adversarial example. Then we search in the low-dimensional latent space produced by the encoder of the adversarial generator. The search is conducted under the guidance of the surrogate gradient obtained from the source model. Experimental results on different domains and different network architectures demonstrate that the proposed attack method can attack fine-tuned models both effectively and efficiently.
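The core idea of the abstract, searching a low-dimensional latent space under the guidance of surrogate gradients from the white-box source model, can be illustrated with a minimal sketch. Everything below is hypothetical: the linear encoder/decoder stands in for the paper's trained adversarial generator, the linear classifier stands in for the source model, and finite differences stand in for backpropagated gradients. It is not the authors' implementation, only an illustration of latent-space search with surrogate guidance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
d_in, d_lat, n_cls = 16, 4, 3

# Stand-ins for a pretrained adversarial generator (encoder E, decoder D).
W_enc = rng.standard_normal((d_lat, d_in)) * 0.1
W_dec = rng.standard_normal((d_in, d_lat)) * 0.1

def encode(x):
    # Clean input -> low-dimensional latent code.
    return W_enc @ x

def decode(z, x):
    # Latent code -> adversarial example; residual form with a bounded perturbation.
    delta = np.clip(W_dec @ z, -0.1, 0.1)
    return x + delta

# Stand-in for the white-box source model: a linear classifier.
W_src = rng.standard_normal((n_cls, d_in)) * 0.5

def source_margin(x_adv, y):
    # Margin of class y over the best other class; the attack aims to drive it below 0.
    logits = W_src @ x_adv
    return logits[y] - np.delete(logits, y).max()

def surrogate_grad(z, x, y, eps=1e-4):
    # Finite-difference surrogate gradient of the source-model margin w.r.t. z.
    g = np.zeros_like(z)
    for i in range(z.size):
        zp, zm = z.copy(), z.copy()
        zp[i] += eps
        zm[i] -= eps
        g[i] = (source_margin(decode(zp, x), y) -
                source_margin(decode(zm, x), y)) / (2 * eps)
    return g

def latent_search(x, y, steps=50, lr=0.5):
    # Search in the latent space, descending the margin via surrogate gradients.
    z = encode(x)
    for _ in range(steps):
        z -= lr * surrogate_grad(z, x, y)
        if source_margin(decode(z, x), y) < 0:
            break  # in the real setting, success is checked by querying the target
    return decode(z, x)

x = rng.standard_normal(d_in)
y = int(np.argmax(W_src @ x))  # current prediction of the source model
x_adv = latent_search(x, y)
```

In the actual BAFT settings, success would be judged by querying the black-box fine-tuned target model, with the source-model gradient serving only as transferable guidance; the sketch omits the target model for brevity.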