Paper Title
Stateful Detection of Adversarial Reprogramming
Paper Authors
Paper Abstract
Adversarial reprogramming allows an attacker to steal computational resources by repurposing a machine learning model to perform a different task of the attacker's choosing. For example, a model trained to recognize images of animals can be reprogrammed to recognize medical images by embedding an adversarial program in the images provided as inputs. This attack can be perpetrated even if the target model is a black box, provided that the model is offered as a service and the attacker can query it and collect its outputs. So far, no defense has been shown to be effective in this scenario. We show for the first time that this attack can be detected with stateful defenses, which store the queries made to the classifier and flag the anomalous cases in which they are too similar to one another. Once a malicious query is detected, the account of the user who issued it can be blocked. The attacker must therefore create many accounts to carry out the attack. To reduce this number, the attacker could craft the adversarial program against a surrogate classifier and then fine-tune it with only a few queries to the target model. In this scenario the effectiveness of the stateful defense is reduced, but we show that it remains effective.
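To make the query-similarity idea concrete, the following is a minimal sketch of a stateful detector, not the paper's actual implementation: it keeps a per-account history of query embeddings and blocks an account when a new query is unusually close to the account's previous queries, as happens when an adversarial program is optimized through many near-identical inputs. The raw-pixel embedding, the k-nearest-neighbor rule, and the threshold value are illustrative assumptions.

```python
import numpy as np

class StatefulDetector:
    """Sketch of a stateful defense: store queries per account and flag
    accounts whose queries are abnormally similar to one another."""

    def __init__(self, k=10, threshold=0.05, history_size=1000):
        self.k = k                    # nearest neighbours used in the similarity test
        self.threshold = threshold    # mean distance below this is "too similar"
        self.history_size = history_size
        self.history = {}             # account_id -> list of past query embeddings
        self.blocked = set()

    def embed(self, query_image):
        # Placeholder feature extractor (flattened, normalized pixels);
        # a real system would use a fixed similarity encoder instead.
        return np.asarray(query_image, dtype=np.float32).ravel() / 255.0

    def check(self, account_id, query_image):
        """Return True if the query is accepted, False if the account is blocked."""
        if account_id in self.blocked:
            return False
        emb = self.embed(query_image)
        past = self.history.setdefault(account_id, [])
        if len(past) >= self.k:
            # Average distance to the k closest stored queries of this account.
            dists = np.sort([np.linalg.norm(emb - p) for p in past])[: self.k]
            if dists.mean() < self.threshold:
                self.blocked.add(account_id)   # suspiciously repetitive querying
                return False
        past.append(emb)
        if len(past) > self.history_size:
            past.pop(0)                        # keep only the most recent queries
        return True
```

Under these assumptions, an account that repeatedly submits small variations of the same image (as during the optimization of an adversarial program) quickly trips the distance test and is blocked, which is what forces the attacker to spread the queries over many accounts.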