Paper title
Block Format Error Bounds and Optimal Block Size Selection
Paper authors
Paper abstract
The amounts of data that modern deep neural networks need to transmit, process, and store have reached truly enormous volumes in the last few years, calling for the invention of new paradigms in both hardware and software development. One of the most promising and rapidly advancing frontiers here is the creation of new numerical formats. In this work we focus on the family of block floating point numerical formats due to their combination of wide dynamic range, numerical accuracy, and efficient hardware implementation of inner products using simple integer arithmetic. These formats are characterized by a block of mantissas with a shared scale factor. The basic Block Floating Point (BFP) format quantizes the block scales to the nearest powers of two from above. Its simple modification, Scaled BFP (SBFP), stores the same scales in full precision and thus allows higher accuracy. In this paper, we rigorously study the statistical behavior of both formats. We develop asymptotic bounds on the inner product error in SBFP- and BFP-quantized normally distributed vectors. Next, we refine those asymptotic results to finite-dimensional settings and derive tight high-dimensional bounds for the same errors. Based on the obtained results, we introduce a performance measure assessing the accuracy of any block format. This measure allows us to determine the optimal parameters, such as the block size, that yield the highest accuracy. In particular, we show that if the precision of the BFP format is fixed at 4 bits, the optimal block size becomes 64. All theoretical derivations are supported by numerical experiments and by studies of the weights of publicly available pretrained neural networks.
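For illustration only, the following minimal NumPy sketch mimics the two formats described in the abstract: each block of values is reduced to signed integer mantissas sharing one scale, where BFP rounds the shared scale up to a power of two and SBFP keeps the scale in full precision. The function name quantize_block, the symmetric 4-bit mantissa range, and the round-to-nearest mantissa rule are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def quantize_block(x, mantissa_bits=4, power_of_two_scale=True):
    """Quantize a 1-D block to a block format: shared scale + integer mantissas.

    power_of_two_scale=True  -> BFP-like  (scale rounded up to a power of two)
    power_of_two_scale=False -> SBFP-like (scale kept in full precision)
    """
    # Largest representable signed mantissa magnitude, e.g. 7 for 4 bits
    # (a symmetric range is assumed here for simplicity).
    max_mantissa = 2 ** (mantissa_bits - 1) - 1
    scale = np.max(np.abs(x)) / max_mantissa
    if scale == 0:
        return np.zeros_like(x)
    if power_of_two_scale:
        # Round the shared scale up to the nearest power of two (BFP).
        scale = 2.0 ** np.ceil(np.log2(scale))
    # Shared-scale integer mantissas, then dequantize back to floats.
    mantissas = np.clip(np.round(x / scale), -max_mantissa, max_mantissa)
    return mantissas * scale

rng = np.random.default_rng(0)
n, block = 4096, 64                      # vector length and block size
a, b = rng.standard_normal(n), rng.standard_normal(n)

def blockwise(x, **kw):
    # Apply the block quantizer independently to consecutive blocks.
    return np.concatenate([quantize_block(x[i:i + block], **kw)
                           for i in range(0, n, block)])

exact = a @ b
for name, p2 in (("BFP", True), ("SBFP", False)):
    approx = blockwise(a, power_of_two_scale=p2) @ blockwise(b, power_of_two_scale=p2)
    print(f"{name}: relative inner-product error = {abs(approx - exact) / abs(exact):.3e}")
```

Running this sketch on normally distributed vectors gives an empirical feel for the inner-product errors the paper bounds analytically; varying the block parameter is one way to probe the reported optimum of block size 64 for 4-bit BFP.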