Paper title
Learning to Pronounce Chinese Without a Pronunciation Dictionary
Paper authors
Abstract
We demonstrate a program that learns to pronounce Chinese text in Mandarin, without a pronunciation dictionary. From non-parallel streams of Chinese characters and Chinese pinyin syllables, it establishes a many-to-many mapping between characters and pronunciations. Using unsupervised methods, the program effectively deciphers writing into speech. Its token-level character-to-syllable accuracy is 89%, which significantly exceeds the 22% accuracy of prior work.