Paper Title


FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations

Paper Authors

Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang

Paper Abstract


Binary neural networks (BNNs) have 1-bit weights and activations. Such networks are well suited for FPGAs, as their dominant computations are bitwise arithmetic and the memory requirement is also significantly reduced. However, compared to state-of-the-art compact convolutional neural network (CNN) models, BNNs tend to produce a much lower accuracy on realistic datasets such as ImageNet. In addition, the input layer of BNNs has gradually become a major compute bottleneck, because it is conventionally excluded from binarization to avoid a large accuracy loss. This work proposes FracBNN, which exploits fractional activations to substantially improve the accuracy of BNNs. Specifically, our approach employs a dual-precision activation scheme to compute features with up to two bits, using an additional sparse binary convolution. We further binarize the input layer using a novel thermometer encoding. Overall, FracBNN preserves the key benefits of conventional BNNs, where all convolutional layers are computed in pure binary MAC operations (BMACs). We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations. To evaluate the performance of FracBNN under a resource-constrained scenario, we implement the entire optimized network architecture on an embedded FPGA (Xilinx Ultra96v2). Our experiments on ImageNet show that FracBNN achieves an accuracy comparable to MobileNetV2, surpassing the best-known BNN design on FPGAs with an increase of 28.9% in top-1 accuracy and a 2.5x reduction in model size. FracBNN also outperforms a recently introduced BNN model with an increase of 2.4% in top-1 accuracy while using the same model size. On the embedded FPGA device, FracBNN demonstrates real-time image classification capability.
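The sketch below illustrates two ingredients mentioned in the abstract: thermometer encoding of the 8-bit input pixels into binary channels, and the XNOR-popcount form of a binary MAC (BMAC). It is a minimal illustration under assumptions, not the paper's implementation: the function names, the uniform threshold placement, and the toy dimensions are hypothetical choices for demonstration only.

```python
import numpy as np

def thermometer_encode(x_uint8, num_bits=8):
    """Thermometer-encode 8-bit pixel values into `num_bits` binary channels:
    channel k is 1 iff the pixel value exceeds the k-th threshold.
    (Uniform threshold spacing is an assumption for illustration only.)"""
    thresholds = (np.arange(num_bits) + 0.5) * (256.0 / num_bits)
    return (x_uint8[..., None] > thresholds).astype(np.uint8)

def bmac(a_bits, w_bits):
    """Binary multiply-accumulate: with +/-1 values stored as {0,1} bits,
    the dot product equals 2 * popcount(XNOR(a, w)) - N."""
    n = a_bits.size
    xnor = (~(a_bits ^ w_bits)) & 1
    return 2 * int(xnor.sum()) - n

# Toy usage: binarize one RGB pixel, then run a single BMAC against random binary weights.
pixel = np.array([10, 120, 240], dtype=np.uint8)
pixel_bits = thermometer_encode(pixel).ravel()   # 3 channels * 8 bits = 24 binary inputs
weights = np.random.randint(0, 2, size=pixel_bits.size, dtype=np.uint8)
print(bmac(pixel_bits, weights))                 # integer in [-24, 24]
```

The point of the encoding is that the input layer can then be computed with the same XNOR-popcount arithmetic as the rest of the network, rather than requiring fixed-point multipliers for the raw 8-bit pixels.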
