Paper Title

McNet: Fuse Multiple Cues for Multichannel Speech Enhancement

Authors

Yujie Yang, Changsheng Quan, Xiaofei Li

Abstract

In multichannel speech enhancement, both spectral and spatial information are vital for discriminating between speech and noise. How to fully exploit these two types of information and their temporal dynamics remains an interesting research problem. As a solution to this problem, this paper proposes a multi-cue fusion network named McNet, which cascades four modules to respectively exploit the full-band spatial, narrow-band spatial, sub-band spectral, and full-band spectral information. Experiments show that each module in the proposed network has its unique contribution and, as a whole, notably outperforms other state-of-the-art methods.
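The four cues named in the abstract can be read as different ways of slicing a multichannel STFT tensor. The sketch below illustrates that reading only; it is not the authors' implementation. The tensor layout (channels, freqs, frames), the function names, the real/imaginary feature stacking, the neighbour count k, and the edge padding are all assumptions made for illustration.

```python
import numpy as np

def full_band_view(x):
    """Hypothetical full-band layout: one sequence per frame, running along frequency.

    x: complex STFT of shape (channels, freqs, frames).
    Returns a real array of shape (frames, freqs, 2 * channels).
    """
    feats = np.concatenate([x.real, x.imag], axis=0)  # (2c, f, t)
    return feats.transpose(2, 1, 0)

def narrow_band_view(x):
    """Hypothetical narrow-band layout: one sequence per frequency, running along time.

    Returns a real array of shape (freqs, frames, 2 * channels).
    """
    feats = np.concatenate([x.real, x.imag], axis=0)  # (2c, f, t)
    return feats.transpose(1, 2, 0)

def sub_band_view(mag, k=1):
    """Hypothetical sub-band layout: each frequency with k neighbours on each side.

    mag: magnitude spectrogram of shape (freqs, frames), e.g. of one reference channel.
    Returns an array of shape (freqs, frames, 2 * k + 1); index k on the last
    axis is the centre frequency itself. Edge padding is an assumption.
    """
    f, _ = mag.shape
    padded = np.pad(mag, ((k, k), (0, 0)), mode="edge")
    return np.stack([padded[i:i + f] for i in range(2 * k + 1)], axis=-1)

# Toy input: 4 microphones, 5 frequency bins, 7 frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 7)) + 1j * rng.standard_normal((4, 5, 7))
mag = np.abs(x[0])
fb = full_band_view(x)
nb = narrow_band_view(x)
sb = sub_band_view(mag, k=1)
```

Under these assumptions, a sequence model applied to `fb` sees all frequencies of one frame, while one applied to `nb` sees the time evolution of a single frequency across channels; `sb` restricts the spectral context to a local neighbourhood.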
