Paper Title

Adversarial Threats to DeepFake Detection: A Practical Perspective

Paper Authors

Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer

Paper Abstract

Facially manipulated images and videos, or DeepFakes, can be used maliciously to fuel misinformation or defame individuals. Therefore, detecting DeepFakes is crucial to increase the credibility of social media platforms and other media-sharing websites. State-of-the-art DeepFake detection techniques rely on neural-network-based classification models, which are known to be vulnerable to adversarial examples. In this work, we study the vulnerabilities of state-of-the-art DeepFake detection methods from a practical standpoint. We perform adversarial attacks on DeepFake detectors in a black-box setting where the adversary does not have complete knowledge of the classification models. We study the extent to which adversarial perturbations transfer across different models and propose techniques to improve the transferability of adversarial examples. We also create more accessible attacks using Universal Adversarial Perturbations, which pose a very feasible attack scenario since they can be easily shared amongst attackers. We perform our evaluations on the winning entries of the DeepFake Detection Challenge (DFDC) and demonstrate that they can be easily bypassed in a practical attack scenario by designing transferable and accessible adversarial attacks.
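
To picture the kind of transfer-based black-box attack the abstract describes, the sketch below shows a minimal, illustrative example in PyTorch; it is not the paper's actual procedure. A simple FGSM perturbation (chosen here only for brevity) is crafted on a white-box surrogate detector and then tested against an unseen target detector. The names `surrogate`, `target`, and `frame` are hypothetical placeholders for two DeepFake classifiers and a preprocessed video frame.

```python
# Illustrative sketch only (not the paper's method): craft an adversarial
# perturbation on a surrogate DeepFake detector and check whether it
# transfers to a separate, black-box target detector.
import torch
import torch.nn.functional as F


def fgsm_perturb(model, x, label, eps=8 / 255):
    """Craft an L-infinity-bounded FGSM perturbation against a white-box model."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)  # logits shape: (batch, 2) = [real, fake]
    loss.backward()
    # One signed-gradient step that increases the detector's loss on the true label.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()


def transfer_attack(surrogate, target, frame):
    """Perturb a fake frame on the surrogate and check whether it fools the target."""
    fake_label = torch.tensor([1])                # assume class index 1 means "fake"
    adv_frame = fgsm_perturb(surrogate, frame, fake_label)
    prediction = target(adv_frame).argmax(dim=1)  # black-box: only the output is read
    return prediction.item() == 0                 # True if the target now says "real"
```

In this setting the attacker never needs the target's gradients or weights; the attack succeeds only to the extent that the perturbation transfers, which is exactly the property the paper studies and strengthens.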
