Dynamic Attention Based Generative Adversarial Network with Phase Post-Processing for Speech Enhancement

Paper title

Dynamic Attention Based Generative Adversarial Network with Phase Post-Processing for Speech Enhancement

Authors

Andong Li, Chengshi Zheng, Renhua Peng, Cunhang Fan, Xiaodong Li

Abstract

Generative adversarial networks (GANs) have recently facilitated the development of speech enhancement. Nevertheless, their performance advantage is still limited when compared with state-of-the-art models. In this paper, we propose a powerful Dynamic Attention Recursive GAN, called DARGAN, for noise reduction in the time-frequency domain. It differs from previous work in three respects. First, the generator uses recursive learning, an iterative training protocol that consists of multiple steps. By reusing the network in each step, the noise components are progressively reduced in a step-wise manner. Second, a dynamic attention mechanism is deployed, which helps to re-adjust the feature distribution in the noise reduction module. Third, we use the deep Griffin-Lim algorithm as the phase post-processing module, which further improves speech quality. Experimental results on the Voice Bank corpus show that the proposed GAN achieves state-of-the-art performance compared with previous GAN-based and non-GAN-based models.
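The recursive learning idea in the abstract, one network reused over several steps so that noise shrinks step by step, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `shared_step` is a hypothetical stand-in (a simple smoothing filter) for the shared generator, and `recursive_enhance`, the blend weights, and the choice to feed the original noisy input back in at every step are not taken from the paper.

```python
from typing import Callable, List

Signal = List[float]


def shared_step(noisy: Signal, estimate: Signal) -> Signal:
    """Hypothetical stand-in for the shared generator network.

    Smooths the current estimate with a 3-point moving average and
    blends in a little of the original noisy input (assumed weights).
    """
    n = len(estimate)
    out = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n, i + 2)
        smooth = sum(estimate[lo:hi]) / (hi - lo)
        out.append(0.8 * smooth + 0.2 * noisy[i])
    return out


def recursive_enhance(
    noisy: Signal,
    step: Callable[[Signal, Signal], Signal],
    num_steps: int = 3,
) -> List[Signal]:
    """Apply the same step function num_steps times.

    Each step sees the original noisy input and the previous estimate,
    and the estimate after every step is returned in order.
    """
    estimate = list(noisy)  # step 0: start from the noisy input itself
    history = []
    for _ in range(num_steps):
        estimate = step(noisy, estimate)  # same function reused each step
        history.append(estimate)
    return history


if __name__ == "__main__":
    noisy = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
    for k, est in enumerate(recursive_enhance(noisy, shared_step), start=1):
        print(k, [round(v, 3) for v in est])
```

The point of the sketch is only the control flow: the parameters of the step (here, a fixed filter) are shared across all steps, so a deeper refinement does not require a larger model.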
