Paper Title

Iterative Pseudo-Labeling for Speech Recognition

Authors

Qiantong Xu, Tatiana Likhomanenko, Jacob Kahn, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

Abstract

Pseudo-labeling has recently shown promise in end-to-end automatic speech recognition (ASR). We study Iterative Pseudo-Labeling (IPL), a semi-supervised algorithm that efficiently performs multiple iterations of pseudo-labeling on unlabeled data as the acoustic model evolves. In particular, IPL fine-tunes an existing model at each iteration using both labeled data and a subset of unlabeled data. We study the main components of IPL: decoding with a language model and data augmentation. We then demonstrate the effectiveness of IPL by achieving state-of-the-art word error rate on the Librispeech test sets in both standard and low-resource settings. We also study the effect of language models trained on different corpora to show that IPL can effectively utilize additional text. Finally, we release a new large in-domain text corpus that does not overlap with the Librispeech training transcriptions to foster research in low-resource, semi-supervised ASR.
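
The iteration described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `iterative_pseudo_labeling`, the `decode` and `fine_tune` callables, the parameters `num_iterations`, `subset_size`, and `seed`, and the choice of uniform random sampling for the unlabeled subset are all assumptions made for the example. In a real system, `decode` would presumably be beam-search decoding with a language model, and `fine_tune` would train the acoustic model with data augmentation.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple

Utterance = Any                      # stand-in for an audio sample
Example = Tuple[Utterance, str]      # (audio, transcript) pair


def iterative_pseudo_labeling(
    model: Any,
    labeled: Sequence[Example],
    unlabeled: Sequence[Utterance],
    decode: Callable[[Any, Utterance], str],          # hypothetical: LM-based decoding
    fine_tune: Callable[[Any, List[Example]], Any],   # hypothetical: training step with augmentation
    num_iterations: int = 3,
    subset_size: int = 2,
    seed: int = 0,
) -> Any:
    """Sketch of the loop: re-label a subset with the current model, then fine-tune."""
    rng = random.Random(seed)
    pool = list(unlabeled)
    for _ in range(num_iterations):
        # Assumption: the subset is drawn uniformly at random each iteration.
        subset = rng.sample(pool, min(subset_size, len(pool)))
        # Pseudo-labels come from the current model, so they change as it evolves.
        pseudo_labeled = [(utt, decode(model, utt)) for utt in subset]
        # Continue training the existing model on labeled plus pseudo-labeled data.
        model = fine_tune(model, list(labeled) + pseudo_labeled)
    return model
```

The point of the sketch is the ordering: pseudo-labels are regenerated inside the loop rather than once up front, and each iteration continues from the existing model instead of retraining from scratch.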
