Paper Title

DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation

Authors

Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee

Abstract

In previous works, only the parameter weights of ASR models were optimized under a fixed-topology architecture. However, the design of a successful model architecture has always relied on human experience and intuition. Besides, many hyperparameters related to the model architecture need to be manually tuned. Therefore, in this paper, we propose an ASR approach with efficient gradient-based architecture search, DARTS-ASR. To examine the generalizability of DARTS-ASR, we apply our approach not only to monolingual ASR in many languages, but also to a multilingual ASR setting. Following previous works, we conducted experiments on a multilingual dataset, IARPA BABEL. The experimental results show that our approach outperforms the baseline fixed-topology architecture, with 10.2% and 10.0% relative reductions in character error rate under the monolingual and multilingual ASR settings, respectively. Furthermore, we perform some analysis on the architectures searched by DARTS-ASR.
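The core idea behind the gradient-based search is the DARTS continuous relaxation: each edge of a searched cell computes a softmax-weighted mixture of candidate operations, so the architecture parameters can be optimized by gradient descent together with the network weights. Below is a minimal PyTorch-style sketch of such a mixed operation; the candidate operation pool, channel size, and the `MixedOp` class name are illustrative assumptions rather than the exact search space used in DARTS-ASR.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Sketch of one edge in a DARTS-style searched cell (illustrative, not the paper's exact search space)."""

    def __init__(self, channels):
        super().__init__()
        # Candidate operations on this edge (assumed pool for illustration).
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Identity(),
        ])
        # Architecture parameters (alpha), trained by gradient descent
        # alongside the network weights.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        # Softmax over alpha gives a soft, differentiable selection of operations.
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

After the search converges, the operation with the largest architecture weight on each edge is typically retained to form the final discrete architecture, whose weights are then retrained or fine-tuned.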
