Title
Low-bit Shift Network for End-to-End Spoken Language Understanding
Authors
Abstract
Deep neural networks (DNNs) have achieved impressive success in multiple domains. Over the years, the accuracy of these models has increased with the proliferation of deeper and more complex architectures. As a result, state-of-the-art solutions are often computationally expensive, which makes them unfit for deployment on edge computing platforms. To mitigate the high computation, memory, and power requirements of convolutional neural network (CNN) inference, we propose the use of power-of-two quantization, which quantizes continuous parameters into low-bit power-of-two values. This reduces computational complexity both by replacing expensive multiplication operations with bit shifts and by using low-bit weights. ResNet is adopted as the building block of our solution, and the proposed model is evaluated on a spoken language understanding (SLU) task. Experimental results show improved performance for shift neural network architectures: our low-bit quantization achieves 98.76% accuracy on the test set, which is comparable to its full-precision counterpart and to state-of-the-art solutions.
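The core idea of power-of-two quantization can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function name, the rounding-in-log-domain scheme, and the exponent clipping range are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def quantize_pow2(w, bits=4):
    """Illustrative power-of-two quantizer (not the paper's exact scheme).

    Maps each nonzero weight to a signed power of two, sign * 2**e,
    by rounding its base-2 log and clipping the exponent e to a
    low-bit signed range. Zeros are kept as zeros.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    # Round in the log domain; guard log2(0) with a placeholder of 1.0.
    exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
    # Clip exponents so they fit in a signed `bits`-bit code.
    exp = np.clip(exp, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    q = sign * np.exp2(exp)
    return np.where(mag > 0, q, 0.0)
```

Because every quantized weight is of the form sign * 2**e, multiplying an integer activation by it reduces to a bit shift (`x << e` or `x >> -e`) plus a sign flip, which is the source of the computational savings the abstract describes.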