Paper Title

FTRANS: Energy-Efficient Acceleration of Transformers using FPGA

Paper Authors

Bingbing Li, Santosh Pandey, Haowen Fang, Yanjun Lyv, Ji Li, Jieyang Chen, Mimi Xie, Lipeng Wan, Hang Liu, Caiwen Ding

Paper Abstract

In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model relying entirely on self-attention mechanisms, without using sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements on sequence-to-sequence tasks. However, the intensive computation and storage of these pre-trained language representations have impeded their adoption on computation- and memory-constrained devices. The field-programmable gate array (FPGA) is widely used to accelerate deep learning algorithms because of its high parallelism and low latency. However, the trained models are still too large to fit into an FPGA fabric. In this paper, we propose an efficient acceleration framework, Ftrans, for transformer-based large-scale language representations. Our framework includes an enhanced block-circulant matrix (BCM)-based weight representation that enables model compression of large-scale language representations at the algorithm level with little accuracy degradation, and an acceleration design at the architecture level. Experimental results show that our proposed framework significantly reduces the model size of NLP models, by up to 16x. Our FPGA design achieves 27.07x and 81x improvements in performance and energy efficiency, respectively, compared to CPU, and up to 8.80x improvement in energy efficiency compared to GPU.
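
As a rough illustration of the BCM idea the abstract refers to (not the paper's FPGA implementation), the minimal NumPy sketch below shows how a block-circulant weight matrix can be stored as the defining vector of each block and multiplied with an input via FFTs; the function names, block size, and check are illustrative assumptions only.

```python
import numpy as np

def circulant_from_first_column(c):
    """Dense b x b circulant matrix whose first column is c (used only for checking)."""
    b = len(c)
    return np.stack([np.roll(c, j) for j in range(b)], axis=1)

def bcm_matvec(blocks, x, b):
    """Multiply a block-circulant weight matrix by x using FFTs.

    blocks[i][j] is the length-b vector defining circulant block W_ij,
    so each b x b block is stored with b numbers instead of b*b.
    """
    p, q = len(blocks), len(blocks[0])   # block rows / block columns
    x_chunks = x.reshape(q, b)           # split x to match the block columns
    y = np.zeros(p * b)
    for i in range(p):
        acc = np.zeros(b)
        for j in range(q):
            # circulant matrix-vector product == circular convolution, done via FFT
            acc += np.fft.ifft(np.fft.fft(blocks[i][j]) * np.fft.fft(x_chunks[j])).real
        y[i * b:(i + 1) * b] = acc
    return y

# Sanity check against an explicit dense matrix (illustrative sizes only).
b, p, q = 4, 2, 3
rng = np.random.default_rng(0)
blocks = [[rng.standard_normal(b) for _ in range(q)] for _ in range(p)]
x = rng.standard_normal(q * b)
dense = np.block([[circulant_from_first_column(blocks[i][j]) for j in range(q)]
                  for i in range(p)])
assert np.allclose(dense @ x, bcm_matvec(blocks, x, b))
```

Because each b x b block is represented by only b parameters, storage drops by roughly a factor of b (b = 16 would correspond to the up-to-16x compression quoted above), and the per-block multiplication cost goes from O(b^2) to O(b log b) through the FFT; the Ftrans design maps this FFT/IFFT-based computation onto FPGA logic with further enhancements not shown in this sketch.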
