Paper title
TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective
Paper authors
Paper abstract
Despite the promising performance of existing frame-wise all-neural beamformers in speech enhancement, their underlying mechanism remains unclear. In this paper, we revisit beamforming behavior from the beam-space dictionary perspective and formulate it as the learning and mixing of different beam-space components. Based on that, we propose an all-neural beamformer called TaylorBM that simulates the Taylor series expansion, in which the 0th-order term serves as a spatial filter that conducts the beam mixing, and several high-order terms perform residual noise cancellation as post-processing. The whole system is designed to work in an end-to-end manner. Experiments are conducted on the spatialized LibriSpeech corpus, and the results show that the proposed approach outperforms existing advanced baselines in terms of evaluation metrics.
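As a rough illustration of the structure the abstract describes, the sketch below processes a single time-frequency bin: it projects the microphone signals onto a small set of beams, mixes those beams into a 0th-order estimate, and then adds high-order correction terms. It is only a sketch under stated assumptions and not the authors' implementation. The function names, the fixed beam dictionary, the hand-set mixing weights and correction terms (which a network would presumably predict), and the 1/q! scaling borrowed from the textbook Taylor series are all hypothetical choices made for this example.

```python
import math


def beam_outputs(mic_bin, beam_dictionary):
    """Project one time-frequency bin of an M-microphone mixture onto K beams.

    mic_bin:         list of M complex values (one per microphone)
    beam_dictionary: list of K beams, each a list of M complex weights (hypothetical)
    """
    # Each beam output is the inner product conj(w) * y summed over microphones.
    return [
        sum(w.conjugate() * y for w, y in zip(beam, mic_bin))
        for beam in beam_dictionary
    ]


def mix_beams(beams, mixing_weights):
    """0th-order term: weighted sum of the K beam-space components."""
    return sum(g * b for g, b in zip(mixing_weights, beams))


def taylor_style_sum(zeroth_order, high_order_terms):
    """Add high-order terms to the 0th-order estimate.

    The 1/q! factor follows the textbook Taylor series and is an assumption here.
    """
    correction = sum(
        term / math.factorial(q)
        for q, term in enumerate(high_order_terms, start=1)
    )
    return zeroth_order + correction


# Toy usage: two microphones, two beams that each select one microphone.
mic_bin = [2 + 0j, 4 + 0j]
dictionary = [[1 + 0j, 0j], [0j, 1 + 0j]]
beams = beam_outputs(mic_bin, dictionary)
zeroth = mix_beams(beams, [0.5, 0.5])
enhanced = taylor_style_sum(zeroth, [1 + 0j, 2 + 0j])
```

In this toy setup the mixing weights simply average the two beams; in an end-to-end system, both the weights and the high-order terms would be learned jointly rather than set by hand.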