Paper Title

Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages

Authors

Anuj Diwan, Preethi Jyothi

Abstract

This work presents a seemingly simple but effective technique to improve low-resource ASR systems for phonetic languages. By identifying sets of acoustically similar graphemes in these languages, we first reduce the output alphabet of the ASR system using linguistically meaningful reductions and then reconstruct the original alphabet using a standalone module. We demonstrate that this lessens the burden and improves the performance of low-resource end-to-end ASR systems (because only reduced-alphabet predictions are needed) and that it is possible to design a very simple but effective reconstruction module that recovers sequences in the original alphabet from sequences in the reduced alphabet. We present a finite state transducer-based reconstruction module that operates on the 1-best ASR hypothesis in the reduced alphabet. We demonstrate the efficacy of our proposed technique using ASR systems for two Indian languages, Gujarati and Telugu. With access to only 10 hours of speech data, we obtain relative WER reductions of up to 7% compared to systems that do not use any reduction.
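The reduce-then-reconstruct idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the reduction map, the toy word counts, and all function names below are hypothetical, and a simple most-frequent-word lookup table stands in for the finite state transducer the paper describes. It only shows the data flow: transcripts are mapped to a reduced alphabet (which the ASR system would be trained to predict), and a separate step maps the reduced 1-best hypothesis back to the original alphabet.

```python
from collections import defaultdict

# Hypothetical reduction map over a toy Latin alphabet: each key is merged
# into a grapheme assumed to sound similar. The paper's actual reductions
# for Gujarati and Telugu are not reproduced here.
REDUCTION = {"b": "p", "d": "t", "g": "k"}


def reduce_text(text):
    """Map text in the original alphabet to the reduced alphabet."""
    return "".join(REDUCTION.get(ch, ch) for ch in text)


def build_reconstruction_table(word_counts):
    """Group original words (with their counts) by their reduced form."""
    table = defaultdict(dict)
    for word, count in word_counts.items():
        table[reduce_text(word)][word] = count
    return table


def reconstruct(reduced_sentence, table):
    """Replace each reduced token with its most frequent original word.

    Stand-in for the FST-based module: a unigram lookup, with unknown
    tokens passed through unchanged.
    """
    out = []
    for token in reduced_sentence.split():
        candidates = table.get(token)
        if candidates:
            out.append(max(candidates, key=candidates.get))
        else:
            out.append(token)
    return " ".join(out)


# Toy word frequencies, invented purely for illustration.
WORD_COUNTS = {"bad": 5, "pat": 2, "dog": 4, "tok": 1}
TABLE = build_reconstruction_table(WORD_COUNTS)

if __name__ == "__main__":
    hypothesis = reduce_text("bad dog")    # what a reduced-alphabet ASR would emit
    print(hypothesis)
    print(reconstruct(hypothesis, TABLE))  # mapped back to the original alphabet
```

A per-word lookup ignores sentence context; an FST-based module of the kind named in the abstract could instead be composed with a language model over the original alphabet, so this sketch should be read only as an outline of the two stages.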
