Paper Title

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space

Authors

Jihwan Lee, Jae-Sung Bae, Seongkyu Mun, Heejin Choi, Joun Yeop Lee, Hoon-Young Cho, Chanwoo Kim

Abstract

With the recent developments in cross-lingual Text-to-Speech (TTS) systems, L2 (second-language, or foreign) accent problems arise. Moreover, running a subjective evaluation for such cross-lingual TTS systems is troublesome. Vowel space analysis, which is often used to explore various aspects of language including L2 accents, is a useful alternative analysis tool. In this study, we apply the vowel space analysis method to explore L2 accents of cross-lingual TTS systems. Through the vowel space analysis, we observe the following three findings: a) a parallel architecture (Glow-TTS) is less L2-accented than an auto-regressive one (Tacotron); b) L2 accents are more dominant in non-shared vowels in a language pair; and c) L2 accents of cross-lingual TTS systems share some phenomena with those of human L2 learners. Our findings imply that it is necessary for TTS systems to handle each language pair differently, depending on its linguistic characteristics such as non-shared vowels. They also hint that we can further incorporate linguistic knowledge in developing cross-lingual TTS systems.
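As a rough illustration of what a vowel space comparison can look like, the sketch below treats each vowel as a point in the (F1, F2) formant plane, then measures how far a synthesized vowel's centroid lies from a native reference and how large the overall vowel space is. This is only a minimal sketch under stated assumptions, not the procedure used in the paper: the function names are hypothetical, the formant values are made-up placeholders rather than measurements, and the choice of Euclidean distance in Hz and polygon area as metrics is an assumption.

```python
from math import hypot
from statistics import mean


def centroid(tokens):
    """Return the mean (F1, F2) in Hz over a list of (F1, F2) measurements."""
    return (mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens))


def formant_distance(a, b):
    """Euclidean distance between two (F1, F2) points, in Hz."""
    return hypot(a[0] - b[0], a[1] - b[1])


def vowel_space_area(corners):
    """Area of the polygon spanned by vowel centroids (shoelace formula).

    `corners` must list the points in order around the polygon.
    """
    n = len(corners)
    twice_area = sum(
        corners[i][0] * corners[(i + 1) % n][1]
        - corners[(i + 1) % n][0] * corners[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


# Placeholder (F1, F2) tokens in Hz; illustrative values only, NOT from the paper.
native = {
    "i": [(300, 2300), (320, 2250)],
    "a": [(750, 1300), (770, 1350)],
    "u": [(320, 850), (340, 900)],
}
synthesized = {
    "i": [(330, 2200), (350, 2150)],
    "a": [(700, 1350), (720, 1400)],
    "u": [(360, 950), (380, 1000)],
}

native_centroids = {v: centroid(t) for v, t in native.items()}
synth_centroids = {v: centroid(t) for v, t in synthesized.items()}

# Per-vowel shift of the synthesized speech away from the native reference.
for vowel in native:
    shift = formant_distance(native_centroids[vowel], synth_centroids[vowel])
    print(f"/{vowel}/ centroid shift: {shift:.1f} Hz")

# Overall vowel space size for each speaker, using the same vowel order.
order = ["i", "a", "u"]
print("native area:", vowel_space_area([native_centroids[v] for v in order]))
print("synthesized area:", vowel_space_area([synth_centroids[v] for v in order]))
```

In practice the formant values would come from a formant tracker run on recorded and synthesized speech, and a perceptual frequency scale or speaker normalization may be preferable to raw Hz.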
