Paper Title
Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation
Paper Authors
Paper Abstract
Adversarial patch attacks are an emerging security threat for real-world deep learning applications. We present Demasked Smoothing, to the best of our knowledge the first approach to certify the robustness of semantic segmentation models against this threat model. Previous work on certifiably defending against patch attacks has mostly focused on the image classification task and often requires changes to the model architecture and additional training, which are undesirable and computationally expensive. In Demasked Smoothing, any segmentation model can be applied without particular training, fine-tuning, or restriction of the architecture. Using different masking strategies, Demasked Smoothing can be applied both for certified detection and certified recovery. In extensive experiments, we show that Demasked Smoothing can on average certify 64% of the pixel predictions for a 1% patch in the detection task and 48% against a 0.5% patch for the recovery task on the ADE20K dataset.
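To make the pipeline described in the abstract concrete, below is a minimal Python sketch of the general idea: mask the input so that any possible patch is fully covered by at least some masks, reconstruct ("demask") each masked image with an image completion model, segment each reconstruction with an off-the-shelf model, and aggregate the per-pixel votes. The names demask, segment, masks, and vote_fraction are hypothetical placeholders, and the fixed vote-fraction condition is a simplification: the paper's actual certificate depends on the masking strategy and the patch size, not on a single threshold.

    import numpy as np

    def demasked_smoothing_sketch(image, masks, demask, segment, vote_fraction=0.5):
        """Illustrative sketch (not the paper's implementation).

        image:  (H, W, C) float array
        masks:  list of (H, W, 1) binary arrays; 0 marks pixels to remove
        demask: image completion model, (masked_image, mask) -> image
        segment: any segmentation model, image -> (H, W) integer label map
        """
        votes = []
        for mask in masks:
            masked = image * mask              # remove (zero out) masked pixels
            completed = demask(masked, mask)   # inpaint the removed region
            votes.append(segment(completed))   # per-pixel class labels
        votes = np.stack(votes)                # (num_masks, H, W)

        # Per-pixel vote counts over all masked variants.
        num_classes = int(votes.max()) + 1
        counts = np.stack([(votes == c).sum(axis=0) for c in range(num_classes)])
        prediction = counts.argmax(axis=0)     # majority-vote label map

        # Simplified certification condition: a pixel's prediction is kept as
        # "certified" when its winning class is predicted on a sufficient
        # fraction of the masked variants (placeholder for the real analysis).
        certified = counts.max(axis=0) >= vote_fraction * len(masks)
        return prediction, certified

The design point this sketch illustrates is why no retraining is needed: the segmentation model only ever sees completed (demasked) images, so any pretrained model can be plugged in as segment, and the masking strategy alone determines whether the certificate targets detection or recovery.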