Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

Paper Title

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition

Authors

Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali, Sanjeev Khudanpur

Abstract

The pervasiveness of intra-utterance code-switching (CS) in spoken content requires that automatic speech recognition (ASR) systems handle mixed-language input. Designing a CS-ASR system poses many challenges, mainly due to data scarcity, grammatical structure complexity, and domain mismatch. The most common method for addressing CS is to train an ASR system on the available transcribed CS speech along with monolingual data. In this work, we propose a zero-shot learning methodology for CS-ASR that augments the monolingual data with artificially generated CS text. We base our approach on random lexical replacements and the Equivalence Constraint (EC), exploiting aligned translation pairs to generate random and grammatically valid CS content. Our empirical results show a 65.5% relative reduction in language model perplexity and a 7.7% relative reduction in ASR word error rate (WER) on two ecologically valid CS test sets. Human evaluation of the text generated using EC suggests that more than 80% is of adequate quality.
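The random lexical replacement idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_lexical_replacement`, the `(source_index, target_index)` alignment format, the `rate` parameter, and the toy tokens are all assumptions made for illustration, and the Equivalence Constraint step is not covered here.

```python
import random


def random_lexical_replacement(src_tokens, tgt_tokens, alignment, rate=0.3, seed=0):
    """Swap a random subset of aligned source words for their translations.

    `alignment` is a list of (src_index, tgt_index) pairs. This format is an
    assumption for the sketch; a real word aligner may emit something else.
    """
    rng = random.Random(seed)

    # Count how often each index appears so that only one-to-one links are
    # used; each swap is then a plain single-word substitution.
    src_counts, tgt_counts = {}, {}
    for s, t in alignment:
        src_counts[s] = src_counts.get(s, 0) + 1
        tgt_counts[t] = tgt_counts.get(t, 0) + 1
    one_to_one = {
        s: t for s, t in alignment
        if src_counts[s] == 1 and tgt_counts[t] == 1
    }

    # Replace each eligible source word with probability `rate`.
    mixed = list(src_tokens)
    for s, t in one_to_one.items():
        if rng.random() < rate:
            mixed[s] = tgt_tokens[t]
    return mixed


# Toy sentence pair (placeholder tokens, not data from the paper).
src = ["ana", "uhibb", "al-qahwa"]
tgt = ["I", "love", "coffee"]
links = [(0, 0), (1, 1), (2, 2)]

mixed_sentence = random_lexical_replacement(src, tgt, links, rate=0.5, seed=1)
```

With `rate=0.0` the source sentence comes back unchanged, and with `rate=1.0` every one-to-one aligned word is replaced. Intermediate rates give mixed sentences that could be added to language model training text, though purely random swaps carry no guarantee of grammaticality.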
