Paper Title

Subtensor Quantization for Mobilenets

Paper Authors

Thu Dinh, Andrey Melnikov, Vasilios Daskalopoulos, Sek Chai

Paper Abstract

Quantization for deep neural networks (DNNs) has enabled developers to deploy models with less memory and more efficient low-power inference. However, not all DNN designs are friendly to quantization. For example, the popular Mobilenet architecture has been tuned to reduce parameter size and computational latency with separable depth-wise convolutions, but not all quantization algorithms work well on it, and accuracy can suffer relative to its floating-point version. In this paper, we analyze several root causes of quantization loss and propose alternatives that do not rely on per-channel or training-aware approaches. We evaluate the image classification task on the ImageNet dataset, and our post-training quantized 8-bit inference top-1 accuracy is within 0.7% of the floating-point version.
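
For context, the sketch below illustrates conventional per-tensor affine 8-bit quantization, the post-training baseline whose accuracy loss on Mobilenets motivates the paper's subtensor approach. It is an illustrative assumption, not the authors' algorithm; the function names are hypothetical.

```python
import numpy as np

def quantize_per_tensor(x, num_bits=8):
    """Affine per-tensor quantization: map float values to uint8 using a
    single scale and zero-point shared by the whole tensor (hypothetical
    helper for illustration, not the paper's subtensor method)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Extend the range to cover zero so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized values back to float to measure quantization error."""
    return scale * (q.astype(np.float32) - zero_point)

# Example: quantize a random weight tensor and report reconstruction error.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
q, scale, zp = quantize_per_tensor(w)
w_hat = dequantize(q, scale, zp)
print("max abs error:", np.abs(w - w_hat).max())
```

Because a single scale must span the whole tensor, outlier channels inflate the quantization step for every other channel, which is one root cause of the accuracy loss the paper analyzes; per-channel schemes avoid this at the cost of extra per-channel metadata.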
