Paper title
DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays
Paper authors
Paper abstract
Deep neural network (DNN)-based speech enhancement algorithms for microphone arrays have proven to be effective solutions for speech understanding and speech recognition in noisy environments. However, in the context of ad-hoc microphone arrays, many challenges remain and raise the need for distributed processing. In this paper, we propose to extend a previously introduced distributed DNN-based time-frequency mask estimation scheme that can efficiently use spatial information in the form of so-called compressed signals, which are pre-filtered target estimates. We study the performance of this algorithm under realistic acoustic conditions and investigate practical aspects of its optimal application. We show that the nodes in the microphone array cooperate by taking advantage of their spatial coverage of the room. We also propose to use the compressed signals to convey not only the target estimate but also the noise estimate, in order to exploit the acoustic diversity recorded throughout the microphone array.
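To make the notion of a time-frequency mask concrete, the following is a minimal sketch and not the authors' implementation. It assumes an oracle Wiener-like mask computed from known speech and noise spectra, whereas the abstract describes a mask predicted by a DNN. The function names (`wiener_like_mask`, `apply_mask`), the toy spectra, and the use of the complementary mask to form a noise estimate are all illustrative assumptions.

```python
# Illustrative sketch only: an oracle Wiener-like time-frequency mask.
# In the paper the mask is estimated by a DNN; here it is computed from
# known speech and noise spectra purely to show what a mask does.


def wiener_like_mask(speech_tf, noise_tf, eps=1e-12):
    """Return a mask value in [0, 1] for every time-frequency bin."""
    mask = []
    for s_frame, n_frame in zip(speech_tf, noise_tf):
        row = []
        for s, n in zip(s_frame, n_frame):
            ps, pn = abs(s) ** 2, abs(n) ** 2  # per-bin powers
            row.append(ps / (ps + pn + eps))
        mask.append(row)
    return mask


def apply_mask(mask, mixture_tf):
    """Scale each bin of the mixture spectrum by its mask value."""
    return [
        [m * y for m, y in zip(m_row, y_row)]
        for m_row, y_row in zip(mask, mixture_tf)
    ]


# Toy 2-frame, 3-bin "STFT" with hypothetical values.
speech = [[1 + 0j, 0j, 2j], [0.5 + 0j, 1 + 1j, 0j]]
noise = [[0j, 1 + 0j, 2 + 0j], [0.5j, 0j, 1 + 0j]]
mixture = [[s + n for s, n in zip(sf, nf)] for sf, nf in zip(speech, noise)]

mask = wiener_like_mask(speech, noise)
target_estimate = apply_mask(mask, mixture)

# Assumption: the complementary mask (1 - mask) yields a noise estimate.
noise_mask = [[1 - m for m in row] for row in mask]
noise_estimate = apply_mask(noise_mask, mixture)
```

In this toy setting the target and noise estimates sum back to the mixture in every bin. In a distributed array, each node would exchange a single filtered signal of this kind with the other nodes instead of all of its raw microphone channels.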