Paper Title

Towards A Critical Evaluation of Robustness for Deep Learning Backdoor Countermeasures

Authors

Huming Qiu, Hua Ma, Zhi Zhang, Alsharif Abuadbba, Wei Kang, Anmin Fu, Yansong Gao

Abstract

Since Deep Learning (DL) backdoor attacks have been revealed as one of the most insidious adversarial attacks, a number of countermeasures have been developed with certain assumptions defined in their respective threat models. However, the robustness of these countermeasures is inadvertently ignored, which can introduce severe consequences; for example, a countermeasure can be misused and result in a false implication of backdoor detection. For the first time, we critically examine the robustness of existing backdoor countermeasures, with an initial focus on three influential model-inspection ones: Neural Cleanse (S&P'19), ABS (CCS'19), and MNTD (S&P'21). Although the three countermeasures claim to work well under their respective threat models, they have inherent, unexplored non-robust cases depending on factors such as the given task, model architecture, dataset, and defense hyper-parameters, which are not even rooted in delicate adaptive attacks. We demonstrate how to trivially bypass them, while staying aligned with their respective threat models, by simply varying the aforementioned factors. In particular, for each defense, formal proofs or empirical studies are used to reveal two non-robust cases where it is not as robust as it claims or expects, especially the recent MNTD. This work highlights the necessity of thoroughly evaluating the robustness of backdoor countermeasures to avoid misleading security implications in unknown non-robust cases.
