Title

BagFlip: A Certified Defense against Data Poisoning

Authors

Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Abstract

Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.
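To make the two threat models in the abstract concrete, here is a minimal sketch, assuming binary feature vectors (as in malware detection): a trigger-less attacker perturbs only the training set, while a backdoor attacker additionally stamps a fixed trigger onto test inputs. All names (`apply_trigger`, the feature positions, the target label) are illustrative and not from the BagFlip paper.

```python
# Hypothetical illustration of trigger-less vs. backdoor data poisoning.
# Inputs are binary feature vectors; labels are 0/1.

def apply_trigger(x, positions=(0,)):
    """Backdoor trigger: flip fixed binary features (hypothetical choice)."""
    x = list(x)
    for i in positions:
        x[i] = 1 - x[i]
    return x

clean_train = [([0, 1, 0, 1], 0), ([1, 0, 1, 0], 1)]

# Trigger-less attack: the attacker perturbs training points/labels,
# but the test input is left untouched.
triggerless_train = [(x, 1 - y) for x, y in clean_train]

# Backdoor attack: poisoned training points carry the trigger with a
# target label, and the attacker also applies the trigger at test time.
backdoor_train = clean_train + [(apply_trigger(x), 1) for x, _ in clean_train]

test_input = [0, 1, 0, 1]
backdoor_test_input = apply_trigger(test_input)  # [1, 1, 0, 1]
```

A certified defense must bound how much the model's prediction can change under either perturbation model, given a budget on how many training points (and, for backdoors, test features) the attacker may modify.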
