Paper Title
Privacy against Real-Time Speech Emotion Detection via Acoustic Adversarial Evasion of Machine Learning
Authors
Abstract
Smart speaker voice assistants (VAs) such as Amazon Echo and Google Home have been widely adopted due to their seamless integration with smart home devices and Internet of Things (IoT) technologies. These VA services raise privacy concerns, especially because of their access to our speech. This work considers one such use case: the unaccountable and unauthorized surveillance of a user's emotion via speech emotion recognition (SER). This paper presents DARE-GP, a solution that creates additive noise to mask users' emotional information while preserving the transcription-relevant portions of their speech. DARE-GP does this by using a constrained genetic programming approach to learn the spectral frequency traits that depict target users' emotional content, and then generating a universal adversarial audio perturbation that provides this privacy protection. Unlike existing works, DARE-GP provides protection that: a) works in real time on previously unheard utterances, b) is effective against previously unseen black-box SER classifiers, c) preserves speech transcription, and d) operates in a realistic acoustic environment. Further, this evasion is robust against defenses employed by a knowledgeable adversary. The evaluations in this work culminate with acoustic evaluations against two off-the-shelf commercial smart speakers, using a small-form-factor device (Raspberry Pi) integrated with a wake-word system to evaluate the efficacy of real-world, real-time deployment.
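To make the idea of a universal additive perturbation concrete, the following is a minimal sketch, not the DARE-GP implementation. It assumes the perturbation is a short, fixed waveform computed ahead of time and simply looped over whatever speech arrives, which is what would let it apply to previously unheard utterances. The function names, the `gain` parameter, the 16 kHz sample rate, and the tone-based stand-in signals are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def mix_universal_perturbation(speech, perturbation, gain=1.0):
    """Add one fixed perturbation to a speech signal, sample by sample.

    speech, perturbation: lists of floats in [-1.0, 1.0] (assumed mono audio).
    The perturbation is looped so that it covers the whole utterance.
    """
    if not perturbation:
        raise ValueError("perturbation must not be empty")
    mixed = []
    for i, sample in enumerate(speech):
        noise = gain * perturbation[i % len(perturbation)]
        # Clip to the valid sample range after adding the noise.
        mixed.append(max(-1.0, min(1.0, sample + noise)))
    return mixed


def tone(freq_hz, amplitude, n_samples, sample_rate=16000):
    """Generate a sine tone; used here only to build stand-in signals."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


# Hypothetical perturbation: a sum of two low-amplitude tones (an assumption,
# not the perturbation that DARE-GP learns).
perturbation = [a + b for a, b in zip(tone(300, 0.05, 1600), tone(1200, 0.03, 1600))]

# Stand-in for a speech signal; real input would come from a microphone.
speech = tone(220, 0.5, 4000)

protected = mix_universal_perturbation(speech, perturbation)
```

Because the perturbation does not depend on the input, the per-sample work is a single addition and clip, which is consistent with the real-time, small-form-factor deployment described in the abstract. How the perturbation itself is found (the constrained genetic programming search) is outside the scope of this sketch.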