Paper Title

qBSA: Logic Design of a 32-bit Block-Skewed RSFQ Arithmetic Logic Unit

Authors

Kundu, Souvik; Datta, Gourav; Beerel, Peter A.; Pedram, Massoud

Abstract

Single flux quantum (SFQ) circuits are an attractive beyond-CMOS technology because they promise two orders of magnitude lower power at clock frequencies exceeding 25 GHz. However, every SFQ gate is clocked, creating very deep gate-level pipelines that are difficult to keep full, particularly for sequences that include data-dependent operations. This paper proposes to increase the throughput of SFQ pipelines by redesigning the datapath to accept and operate on least-significant bits (LSBs) clock cycles earlier than more significant bits. This skewed datapath approach reduces the latency of the LSB side, whose results can be fed back earlier for use in subsequent data-dependent operations, increasing their throughput. In particular, we propose to group the bits into 4-bit blocks that are operated on concurrently and create block-skewed datapath units for 32-bit operation. This skewed approach allows a subsequent data-dependent operation to start evaluating as soon as the first 4-bit block completes. Using this general approach, we develop a block-skewed MIPS-compatible 32-bit ALU. Our gate-level Verilog design improves the throughput of 32-bit data-dependent operations by 2x and 1.5x compared to previously proposed 4-bit bit-slice and 32-bit Ladner-Fischer ALUs, respectively.
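
The block-skewed idea described in the abstract can be illustrated with a small behavioral model. The sketch below is not the authors' gate-level Verilog design and does not model SFQ clocking or timing; it only assumes that a 32-bit word is split into eight 4-bit blocks, that blocks are processed LSB first, and that each result block is handed to a dependent operation as soon as it is produced. All names (split_blocks, skewed_add, chained_add) are hypothetical and chosen for illustration.

```python
BLOCK_BITS = 4                      # block width taken from the abstract
NUM_BLOCKS = 8                      # 8 blocks x 4 bits = 32 bits
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def split_blocks(word):
    """Split a 32-bit word into 4-bit blocks, least-significant block first."""
    return [(word >> (BLOCK_BITS * i)) & BLOCK_MASK for i in range(NUM_BLOCKS)]


def join_blocks(blocks):
    """Reassemble a word from 4-bit blocks given least-significant block first."""
    word = 0
    for i, block in enumerate(blocks):
        word |= (block & BLOCK_MASK) << (BLOCK_BITS * i)
    return word


def skewed_add(a_blocks, b_blocks):
    """Add two block streams LSB first, yielding each sum block as it completes.

    The carry out of one block feeds the next block, so block i of the result
    is available before blocks i+1 .. 7 have been computed.
    """
    carry = 0
    for a, b in zip(a_blocks, b_blocks):
        total = a + b + carry
        carry = total >> BLOCK_BITS
        yield total & BLOCK_MASK


def chained_add(a, b, c):
    """Compute (a + b) + c modulo 2**32 with two dependent skewed additions.

    The second addition consumes each block of the first addition as soon as
    it is yielded, instead of waiting for the full 32-bit result.
    """
    first = skewed_add(split_blocks(a), split_blocks(b))
    second = skewed_add(first, split_blocks(c))
    return join_blocks(list(second))
```

Because skewed_add is a generator, the dependent addition in chained_add starts on the lowest block while the upper blocks of the first addition are still pending, which mirrors, at a purely functional level, the early start of data-dependent operations that the abstract describes.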
