Paper title
MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement
Paper authors
Paper abstract
Speech enhancement improves speech quality and benefits a range of downstream tasks. However, most current speech enhancement work is devoted to improving downstream automatic speech recognition (ASR), and only a relatively small amount of work focuses on the automatic speaker verification (ASV) task. In this work, we propose MVNet, which consists of a memory assistance module that improves downstream ASR performance and a vocal reinforcement module that boosts ASV performance. In addition, we design a new loss function to improve speaker vocal similarity. Experimental results on the Libri2Mix dataset show that our method outperforms baseline methods on several metrics, including speech quality, intelligibility, and speaker vocal similarity.