Paper Title

A Non-Volatile All-Spin Non-Binary Matrix Multiplier: An Efficient Hardware Accelerator for Machine Learning

Paper Authors

Rahnuma Rahman, Supriyo Bandyopadhyay

Paper Abstract

We propose and analyze a compact and non-volatile nanomagnetic (all-spin) non-binary matrix multiplier performing the multiply-and-accumulate (MAC) operation using two magnetic tunnel junctions - one activated by strain to act as the multiplier, and the other activated by spin-orbit torque pulses to act as a domain wall synapse that performs the operation of the accumulator. It has two advantages over the usual crossbar-based electronic non-binary matrix multiplier. First, while the crossbar architecture requires N³ devices to multiply two matrices, we require only 2N² devices. Second, our matrix multiplier is non-volatile and retains the information about the product matrix after being powered off. Here, we present an example where each MAC operation can be performed in ~5 ns and the maximum energy dissipated per operation is ~60N_max aJ, where N_max is the largest matrix size. This provides a very useful hardware accelerator for machine learning and artificial intelligence tasks which involve the multiplication of large matrices. The non-volatility allows the matrix multiplier to be embedded in powerful non-von-Neumann architectures, including processor-in-memory. It also allows much of the computing to be done at the edge (of internet-of-things) while reducing the need to access the cloud, thereby making artificial intelligence more resilient against cyberattacks.
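
To make the scaling claims concrete, the short Python sketch below (an illustrative back-of-envelope, not code from the paper) compares the device counts and plugs the quoted ~60·N_max aJ per-MAC energy into the standard N³ MAC count for a dense N×N matrix product; the function names and the choice N = N_max = 128 are hypothetical.

# Back-of-envelope comparison using only figures quoted in the abstract.
# All names here are illustrative, not taken from the paper.

def crossbar_device_count(n: int) -> int:
    # Conventional crossbar-based non-binary multiplier: ~N^3 devices.
    return n ** 3

def all_spin_device_count(n: int) -> int:
    # Proposed all-spin design: two magnetic tunnel junctions per MAC cell, ~2N^2 devices.
    return 2 * n ** 2

def product_energy_joules(n: int, n_max: int) -> float:
    # Worst-case energy quoted in the abstract: ~60 * N_max aJ per MAC.
    macs = n ** 3                      # a dense N x N product needs N^3 MACs
    energy_aj = macs * 60.0 * n_max    # total energy in attojoules
    return energy_aj * 1e-18           # convert aJ -> J

if __name__ == "__main__":
    n = n_max = 128
    print(crossbar_device_count(n))        # 2,097,152 crossbar devices
    print(all_spin_device_count(n))        # 32,768 all-spin devices
    print(product_energy_joules(n, n_max)) # ~1.6e-8 J (~16 nJ) for the whole product

Under these assumed numbers the device count drops by a factor of N/2, which is the abstract's main scaling argument against the crossbar architecture.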
