Iterative Pseudo-Labeling for Speech Recognition

Paper title

Iterative Pseudo-Labeling for Speech Recognition

Authors

Qiantong Xu, Tatiana Likhomanenko, Jacob Kahn, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

Abstract

Pseudo-labeling has recently shown promise in end-to-end automatic speech recognition (ASR). We study Iterative Pseudo-Labeling (IPL), a semi-supervised algorithm which efficiently performs multiple iterations of pseudo-labeling on unlabeled data as the acoustic model evolves. In particular, IPL fine-tunes an existing model at each iteration using both labeled data and a subset of unlabeled data. We study the main components of IPL: decoding with a language model and data augmentation. We then demonstrate the effectiveness of IPL by achieving state-of-the-art word error rate on the Librispeech test sets in both standard and low-resource settings. We also study the effect of language models trained on different corpora to show IPL can effectively utilize additional text. Finally, we release a new large in-domain text corpus which does not overlap with the Librispeech training transcriptions to foster research in low-resource, semi-supervised ASR.
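
The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no subset size, iteration count, or training details, so `decode`, `fine_tune`, `num_iterations`, `subset_size`, and `seed` are hypothetical placeholders supplied by the caller. Data augmentation is omitted here and would live inside `fine_tune`. The sketch only shows the control flow: at each iteration the current model labels a fresh subset of the unlabeled data, and the same model is then fine-tuned on the labeled data plus those pseudo-labels.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def iterative_pseudo_labeling(
    model: Any,
    labeled: Sequence[Tuple[Any, str]],
    unlabeled: Sequence[Any],
    decode: Callable[[Any, Any], str],
    fine_tune: Callable[[Any, List[Tuple[Any, str]]], Any],
    num_iterations: int,
    subset_size: int,
    seed: int = 0,
) -> Any:
    """Toy IPL-style loop; every argument name here is an assumption."""
    rng = random.Random(seed)
    pool = list(unlabeled)
    for _ in range(num_iterations):
        # Draw a subset of the unlabeled data for this iteration.
        subset = rng.sample(pool, min(subset_size, len(pool)))
        # Pseudo-label the subset with the *current* model
        # (decode is assumed to wrap acoustic model + language model decoding).
        pseudo_labeled = [(x, decode(model, x)) for x in subset]
        # Continue training the existing model rather than restarting from scratch.
        model = fine_tune(model, list(labeled) + pseudo_labeled)
    return model
```

Because `decode` and `fine_tune` are plain callables, the loop can be exercised with trivial stand-ins before plugging in a real acoustic model and decoder.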
