TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective

Paper title

TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective

Authors

Andong Li, Guochen Yu, Wenzhe Liu, Xiaodong Li, Chengshi Zheng

Abstract

Despite the promising performance of existing frame-wise all-neural beamformers in speech enhancement, their underlying mechanism remains unclear. In this paper, we revisit beamforming behavior from the beam-space dictionary perspective and formulate it as the learning and mixing of different beam-space components. Based on that, we propose an all-neural beamformer called TaylorBM that simulates the Taylor series expansion operation: the 0th-order term serves as a spatial filter that conducts the beam mixing, and several high-order terms are tasked with residual noise cancellation as post-processing. The whole system is devised to work in an end-to-end manner. Experiments are conducted on the spatialized LibriSpeech corpus, and the results show that the proposed approach outperforms existing advanced baselines in terms of evaluation metrics.
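The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array shapes, and the plain summation of the high-order terms are assumptions made for illustration, and in the actual system the mixing weights and residual terms would be produced by neural networks rather than passed in as arrays.

```python
import numpy as np


def beam_mix(beams, weights):
    """0th-order term (assumed form): mix beam-space components.

    beams:   complex array of shape (B, F, T), one spectrogram per beam
    weights: complex array of shape (B, F, T), per-bin mixing weights
    returns: complex array of shape (F, T)
    """
    # Weighted sum over the beam axis acts as the spatial filter.
    return np.sum(weights * beams, axis=0)


def taylor_style_estimate(beams, weights, residual_terms):
    """Add high-order correction terms to the 0th-order output.

    residual_terms: list of complex (F, T) arrays standing in for the
    high-order terms that cancel residual noise (hypothetical inputs).
    """
    zeroth = beam_mix(beams, weights)
    # Summing the terms directly is an assumption of this sketch.
    return zeroth + sum(residual_terms, np.zeros_like(zeroth))
```

With one-hot weights the 0th-order term simply selects a single beam, which makes the mixing interpretation easy to check by hand.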
