Paper Title
One-class Learning Towards Synthetic Voice Spoofing Detection
Authors
Abstract
Human voices can be used to authenticate a speaker's identity, but automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks such as impersonation, replay, text-to-speech, and voice conversion. Researchers have recently developed anti-spoofing techniques to make ASV systems more reliable against such attacks. However, most methods have difficulty detecting unknown attacks in practical use, because these attacks often follow statistical distributions different from those of known attacks. In particular, the rapid development of synthetic voice spoofing algorithms is producing increasingly powerful attacks, putting ASV systems at risk from unseen attacks. In this work, we propose an anti-spoofing system that uses one-class learning to detect unknown synthetic voice spoofing attacks (i.e., text-to-speech or voice conversion). The key idea is to compact the bona fide speech representation and inject an angular margin that separates the spoofing attacks in the embedding space. Without resorting to any data augmentation methods, our proposed system achieves an equal error rate (EER) of 2.19% on the evaluation set of the ASVspoof 2019 Challenge logical access scenario, outperforming all existing single systems (i.e., those without model ensemble).
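The key idea in the abstract (compacting bona fide embeddings while keeping spoofed ones away by an angular margin) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the single reference direction `w`, the two margins `m_real` and `m_fake`, and the scale `alpha` are all assumptions chosen for illustration, and the abstract states none of them. The sketch penalizes bona fide embeddings whose cosine similarity to `w` falls below `m_real`, and spoofed embeddings whose similarity rises above `m_fake`.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def one_class_margin_loss(embeddings, labels, w,
                          m_real=0.9, m_fake=0.2, alpha=20.0):
    """Illustrative one-class loss with two angular margins.

    labels: 0 = bona fide, 1 = spoof (a labelling convention assumed here).
    m_real, m_fake, alpha: hypothetical values, not taken from the abstract.
    """
    total = 0.0
    for x, y in zip(embeddings, labels):
        s = cosine(w, x)
        if y == 0:
            # Bona fide: pull towards w, penalise similarity below m_real.
            total += softplus(alpha * (m_real - s))
        else:
            # Spoof: push away from w, penalise similarity above m_fake.
            total += softplus(alpha * (s - m_fake))
    return total / len(embeddings)

# Toy 2-D example: w is the assumed "bona fide" direction.
w = [1.0, 0.0]
well_separated = one_class_margin_loss([[1.0, 0.1], [-1.0, 0.5]], [0, 1], w)
mislabelled = one_class_margin_loss([[1.0, 0.1], [-1.0, 0.5]], [1, 0], w)
```

In this toy setup the loss is small when the bona fide embedding lies close to `w` and the spoofed one points away, and large when the labels are swapped. Only the bona fide class is pulled into a tight region, and everything else is merely pushed out of it, so an attack type never seen in training can still fall outside that region.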