Paper Title


Accuracy and Resiliency of Analog Compute-in-Memory Inference Engines

Paper Authors

Zhe Wan, Tianyi Wang, Yiming Zhou, Subramanian S. Iyer, Vwani P. Roychowdhury

Abstract


Recently, analog compute-in-memory (CIM) architectures based on emerging analog non-volatile memory (NVM) technologies have been explored for deep neural networks (DNNs) to improve energy efficiency. Such architectures, however, leverage charge conservation, an operation with infinite resolution, and thus are susceptible to errors. The computations in DNNs realized by analog NVM thus have high uncertainty due to device stochasticity. Several reports have demonstrated the use of analog NVM for CIM at a limited scale. It is unclear whether the uncertainties in computation will prohibit large-scale DNNs. To explore this critical issue of scalability, this paper first presents a simulation framework to evaluate the feasibility of large-scale DNNs based on the CIM architecture and analog NVM. Simulation results show that DNNs trained for high-precision digital computing engines are not resilient against the uncertainty of the analog NVM devices. To avoid such catastrophic failures, this paper introduces an analog floating-point representation for the DNN, and the Hessian-Aware Stochastic Gradient Descent (HA-SGD) training algorithm to enhance the inference accuracy of trained DNNs. As a result of such enhancements, DNNs such as Wide ResNets for the CIFAR-100 image recognition problem are demonstrated to have significant improvements in accuracy without adding cost to the inference hardware.
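The failure mode the abstract describes, computation uncertainty arising from device stochasticity, can be illustrated with a minimal sketch. The multiplicative Gaussian noise model, the `analog_matvec` helper, and the noise levels below are assumptions chosen for illustration; they are not the paper's actual device model or the HA-SGD algorithm.

```python
import numpy as np

# Illustrative sketch only (assumed noise model, not the paper's):
# an analog CIM array evaluates y = W @ x in memory, but each stored
# conductance is stochastic. A simple abstraction is multiplicative
# Gaussian noise applied to every weight read:
#     W_noisy = W * (1 + sigma * N(0, 1))
rng = np.random.default_rng(0)

def analog_matvec(W, x, sigma, rng):
    """Matrix-vector product with per-read multiplicative weight noise."""
    W_noisy = W * (1.0 + sigma * rng.standard_normal(W.shape))
    return W_noisy @ x

W = rng.standard_normal((64, 64)) / 8.0  # hypothetical trained weights
x = rng.standard_normal(64)
exact = W @ x

# The output error grows with the device noise level sigma. A network
# trained assuming exact digital arithmetic has no mechanism to absorb
# this perturbation, which is the gap that noise-aware training schemes
# (such as the paper's HA-SGD) aim to close.
errors = {s: np.linalg.norm(analog_matvec(W, x, s, rng) - exact)
          for s in (0.0, 0.05, 0.2)}
```

With `sigma = 0` the product is exact; as `sigma` increases, the deviation from the digital result accumulates across every layer of a deep network, which is why the abstract reports catastrophic failure for conventionally trained DNNs.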
