Paper Title
Latency Control for Keyword Spotting
Authors
Abstract
Conversational agents commonly use keyword spotting (KWS) to initiate voice interaction with the user. For user experience and privacy reasons, existing approaches to KWS largely focus on accuracy, which often comes at the expense of added latency. To address this tradeoff, we propose a novel approach to controlling KWS model latency that generalizes to any loss function without explicit knowledge of the keyword endpoint. Through a single tunable hyperparameter, our approach lets one balance detection latency and accuracy for the target application. Empirically, we show that our approach gives superior performance under latency constraints compared to existing methods. Namely, we obtain a substantial 25% relative improvement in false accepts at a fixed latency target compared to the state-of-the-art baseline. We also show that when our approach is used in conjunction with a max-pooling loss, we improve false accepts by a relative 25% at a fixed latency compared to cross-entropy loss.