Title
Future Vector Enhanced LSTM Language Model for LVCSR
Authors
Abstract
Language models (LMs) play an important role in large vocabulary continuous speech recognition (LVCSR). However, a traditional language model only predicts the next single word given the history, while consecutive predictions over a sequence of words are usually demanded and useful in LVCSR. This mismatch between single-word prediction in training and long-term sequence prediction in real demands may lead to performance degradation. In this paper, a novel enhanced long short-term memory (LSTM) LM using a future vector is proposed. In addition to the given history, the rest of the sequence is also embedded by future vectors. The future vector can be incorporated into the LSTM LM, giving the model the ability to capture much longer sequence-level information. Experiments show that the proposed LSTM LM achieves better BLEU scores on long-term sequence prediction. For speech recognition rescoring, although the proposed LSTM LM obtains only slight gains on its own, the new model appears to be strongly complementary to the conventional LSTM LM: rescoring with both the new and the conventional LSTM LMs achieves a large improvement in word error rate.
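To make the core idea concrete, the following is a minimal sketch of how a future vector might be combined with a history state at prediction time. All names, dimensions, and the additive combination in the output layer are illustrative assumptions, not the paper's exact formulation; in the actual model the history state would come from an LSTM and the future vector from an embedding of the remaining words.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 10, 8  # hypothetical vocabulary size and state dimension

# Hypothetical output-layer parameters (illustrative, not from the paper)
W_h = rng.normal(size=(vocab, d))  # maps the history state to logits
W_f = rng.normal(size=(vocab, d))  # maps the future vector to logits

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_conventional(h):
    # Conventional LSTM LM: next-word distribution from the history state only
    return softmax(W_h @ h)

def predict_enhanced(h, f):
    # Future-vector-enhanced LM: the rest of the sequence is summarized in f,
    # which is combined with the history state before prediction
    return softmax(W_h @ h + W_f @ f)

h = rng.normal(size=d)  # e.g. an LSTM hidden state over the history
f = rng.normal(size=d)  # future vector embedding the remaining sequence
p = predict_enhanced(h, f)  # a valid next-word distribution over the vocabulary
```

In rescoring, the two distributions could be interpolated, which is consistent with the abstract's observation that the two models are complementary.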