Paper Title
Dynamic Sparsity Neural Networks for Automatic Speech Recognition
Paper Authors
Paper Abstract
In automatic speech recognition (ASR), model pruning is a widely adopted technique for reducing model size and latency so that neural network models can be deployed on resource-constrained edge devices. However, multiple models with different sparsity levels usually need to be trained separately and deployed to heterogeneous target hardware with different resource specifications, as well as to applications with varying latency requirements. In this paper, we present Dynamic Sparsity Neural Networks (DSNN) that, once trained, can instantly switch to any predefined sparsity configuration at run-time. We demonstrate the effectiveness and flexibility of DSNN through experiments on internal production datasets with Google Voice Search data, and show that the performance of a DSNN model is on par with that of individually trained single-sparsity networks. Our trained DSNN model can therefore greatly ease the training process and simplify deployment in diverse scenarios with resource constraints.
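The central mechanism the abstract describes, one shared set of weights that can serve several predefined sparsity configurations selectable at run-time, can be sketched in a few lines. The following is a minimal, hypothetical Python/NumPy illustration using simple magnitude-based pruning masks; SPARSITY_LEVELS, magnitude_mask, and DynamicSparseLayer are names invented for this sketch rather than the paper's API, and the paper's actual joint training procedure is not reproduced here.

import numpy as np

# Hypothetical sketch: runtime-switchable sparsity via magnitude-based masks.
# One trained weight matrix is shared; each predefined sparsity level gets a
# precomputed binary mask, and switching levels only swaps the active mask.

SPARSITY_LEVELS = [0.0, 0.5, 0.7, 0.9]  # assumed predefined configurations

def magnitude_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a binary mask that zeroes the smallest-magnitude weights."""
    if sparsity <= 0.0:
        return np.ones_like(weights)
    k = int(weights.size * sparsity)  # number of weights to prune
    threshold = np.partition(np.abs(weights).ravel(), k)[k]
    return (np.abs(weights) >= threshold).astype(weights.dtype)

class DynamicSparseLayer:
    """A layer whose active sparsity configuration is chosen at run-time."""

    def __init__(self, weights: np.ndarray):
        self.weights = weights
        # Precompute one mask per predefined sparsity configuration.
        self.masks = {s: magnitude_mask(weights, s) for s in SPARSITY_LEVELS}

    def forward(self, x: np.ndarray, sparsity: float) -> np.ndarray:
        # "Instantly switch" by applying the mask for the requested level.
        return x @ (self.weights * self.masks[sparsity])

# Usage: the same trained layer serves a high-end device (dense) and a
# resource-constrained edge device (90% sparse) without retraining.
layer = DynamicSparseLayer(np.random.randn(256, 256).astype(np.float32))
x = np.random.randn(1, 256).astype(np.float32)
y_dense = layer.forward(x, 0.0)
y_edge = layer.forward(x, 0.9)

Under these assumptions, switching sparsity at inference is just selecting a different precomputed mask over the shared weights, which is what lets a single trained model cover heterogeneous hardware and latency budgets instead of maintaining one separately trained model per sparsity level.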