Paper Title
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Paper Authors
Paper Abstract
Quantization is a technique for reducing the computation and memory cost of DNN models, which are growing increasingly large. Existing quantization solutions use fixed-point integer or floating-point data types, which offer limited benefits, as both require more bits to maintain the accuracy of the original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and high-precision quantization for a fraction of outlier values. Even though this line of work brings algorithmic benefits, it also introduces significant hardware overhead due to variable-length encoding and decoding. In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads. ANT leverages two key innovations to exploit the intra-tensor and inter-tensor adaptive opportunities in DNN models. First, we propose a new data type, flint, that combines the advantages of float and int to adapt to the importance of different values within a tensor. Second, we propose an adaptive framework that selects the best type for each tensor according to its distribution characteristics. We design a unified processing element architecture for ANT and show that it integrates easily with existing DNN accelerators. Our design yields a 2.8$\times$ speedup and a 2.5$\times$ energy efficiency improvement over state-of-the-art quantization accelerators.
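To make the abstract's two ideas concrete, here is a minimal Python sketch. The first block illustrates the intuition behind a flint-style tapered code using a hypothetical 4-bit unsigned layout, where the length of the leading run of 1s acts as a unary exponent; the paper's actual flint encoding (sign handling, exponent mapping) differs in its details, so treat this as an illustration of the concept, not the paper's specification.

```python
# Toy 4-bit "flint-like" tapered code (hypothetical layout, NOT the paper's
# exact encoding). The leading run of 1s acts as a unary exponent; the bits
# after its terminating 0 (if any) form the mantissa. Codes near zero behave
# like plain integers (fine steps); large codes behave like floats (coarse
# steps, wide dynamic range).

def decode_toy_flint4(code: int) -> int:
    """Decode a 4-bit unsigned toy tapered code into an integer value."""
    assert 0 <= code < 16
    bits = f"{code:04b}"
    k = len(bits) - len(bits.lstrip("1"))       # leading-1 run length
    if k == 0:
        return int(bits[1:], 2)                 # 0xxx: int-like, values 0..7
    tail = bits[k + 1:]                         # drop run and its 0 terminator
    m = int(tail, 2) if tail else 0
    w = len(tail)                               # mantissa shrinks as k grows
    return (1 << (k + 2)) + (m << (k + 2 - w))  # float-like: 2^(k+2)*(1+m/2^w)

# Decoded values: 0..7 (step 1), 8..14 (step 2), 16/24 (step 8), 32, 64.
print([decode_toy_flint4(c) for c in range(16)])
```

The second block sketches the inter-tensor selection idea: quantize each tensor with every candidate low-bit type and keep the type with the lowest reconstruction error. The int4 and power-of-two candidates and the MSE criterion here are stand-ins chosen for illustration; the paper's framework selects among its own set of types, including flint.

```python
import numpy as np

def quant_int4(x: np.ndarray) -> np.ndarray:
    """Uniform symmetric 4-bit integer quantization (quantize + dequantize)."""
    scale = max(float(np.abs(x).max()), 1e-8) / 7.0
    return np.clip(np.round(x / scale), -8, 7) * scale

def quant_pot4(x: np.ndarray) -> np.ndarray:
    """Power-of-two ('float-like') 4-bit quantization: sign + 3-bit exponent."""
    scale = max(float(np.abs(x).max()), 1e-8)
    mag = np.abs(x) / scale
    e = np.clip(np.round(np.log2(np.maximum(mag, 2.0 ** -8))), -7, 0)
    deq = np.sign(x) * scale * 2.0 ** e
    return np.where(mag < 2.0 ** -7.5, 0.0, deq)  # flush tiny values to zero

def select_type(x, candidates):
    """Return the candidate with the lowest mean-squared reconstruction error."""
    errs = {name: float(np.mean((x - q(x)) ** 2)) for name, q in candidates.items()}
    return min(errs, key=errs.get), errs

candidates = {"int4": quant_int4, "pot4": quant_pot4}
rng = np.random.default_rng(0)
tensors = {
    "gaussian (no outliers)": rng.standard_normal(4096),  # int4 typically wins
    "heavy-tailed (outliers)": rng.standard_t(2, 4096),   # pot4 typically wins
}
for name, t in tensors.items():
    print(name, "->", select_type(t, candidates))
```

Note that every candidate here is a fixed-length 4-bit code, so this per-tensor selection costs only a small type tag per tensor at runtime, which matches the abstract's fixed-length, low-hardware-overhead claim.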