Paper Title

DGC-vector: A new speaker embedding for zero-shot voice conversion

Authors

Ruitong Xiao, Haitong Zhang, Yue Lin

Abstract

A growing number of zero-shot voice conversion algorithms have been proposed recently. As a fundamental part of zero-shot voice conversion, the speaker embedding is key to improving the speaker similarity of the converted speech. In this paper, we study the impact of speaker embeddings on zero-shot voice conversion performance. To better represent the characteristics of the target speaker and improve speaker similarity in zero-shot voice conversion, we propose a novel speaker representation method. Our method combines the advantages of the D-vector, global style token (GST) based speaker representation, and auxiliary supervision. Objective and subjective evaluations show that the proposed method achieves decent performance on zero-shot voice conversion and significantly improves speaker similarity over D-vector and GST-based speaker embeddings.
