Paper Title

AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification

Paper Authors

Sonal Joshi, Saurabh Kataria, Jesus Villalba, Najim Dehak

Paper Abstract

Adversarial attacks pose a severe security threat to state-of-the-art speaker identification systems, thereby making it vital to propose countermeasures against them. Building on our previous work that used representation learning to classify and detect adversarial attacks, we propose an improvement to it using AdvEst, a method to estimate adversarial perturbation. First, we prove our claim that training the representation learning network using adversarial perturbations as opposed to adversarial examples (consisting of the combination of clean signal and adversarial perturbation) is beneficial because it eliminates nuisance information. At inference time, we use a time-domain denoiser to estimate the adversarial perturbations from adversarial examples. Using our improved representation learning approach to obtain attack embeddings (signatures), we evaluate their performance for three applications: known attack classification, attack verification, and unknown attack detection. We show that common attacks in the literature (Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), Carlini-Wagner (CW) with different Lp threat models) can be classified with an accuracy of ~96%. We also detect unknown attacks with an equal error rate (EER) of ~9%, which is an absolute improvement of ~12% over our previous work.
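Since the abstract describes the core inference pipeline (a time-domain denoiser estimates the clean signal, the adversarial perturbation is recovered as the residual, and an embedding network maps that perturbation to an attack signature), a minimal sketch may help make the flow concrete. This is not the authors' code: the `Denoiser` and `SignatureNet` modules below are hypothetical placeholder architectures standing in for the paper's trained denoiser and attack-embedding network.

```python
# Minimal sketch of the AdvEst inference pipeline described in the abstract.
# Denoiser and SignatureNet are hypothetical stand-ins; their architectures
# and names are assumptions, not the paper's actual models.
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Placeholder time-domain denoiser: maps an adversarial waveform
    to an estimate of the underlying clean waveform."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv1d(1, 1, kernel_size=9, padding=4)

    def forward(self, x_adv):
        return self.net(x_adv)

class SignatureNet(nn.Module):
    """Placeholder embedding network trained on adversarial perturbations;
    outputs a fixed-size attack signature."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.frontend = nn.Conv1d(1, 64, kernel_size=16, stride=8)
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.proj = nn.Linear(64, emb_dim)

    def forward(self, delta):
        h = torch.relu(self.frontend(delta))
        return self.proj(self.pool(h).squeeze(-1))

def attack_signature(x_adv, denoiser, signature_net):
    """Estimate the perturbation as (adversarial example - denoised signal)
    and embed it, so the embedding sees only attack information rather than
    the nuisance content of the clean speech."""
    with torch.no_grad():
        x_clean_hat = denoiser(x_adv)      # estimate of the clean signal
        delta_hat = x_adv - x_clean_hat    # estimated adversarial perturbation
        return signature_net(delta_hat)    # attack embedding (signature)

# Usage on a batch of one 1-second, 16 kHz waveform:
x_adv = torch.randn(1, 1, 16000)
emb = attack_signature(x_adv, Denoiser(), SignatureNet())
print(emb.shape)  # torch.Size([1, 128])
```

The resulting signatures would then feed the three applications the abstract lists: a closed-set classifier for known attacks, a verification back-end comparing signature pairs, and an out-of-distribution detector for unknown attacks.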
