Paper Title
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Paper Authors
Paper Abstract
Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications, ranging from computer vision for medicine to autonomous driving of modern cars, as well as other sectors such as security, healthcare, and finance. However, to achieve their impressive performance, these algorithms employ very deep networks that require significant computational power, both during training and at inference time. A single inference of a DL model may require billions of multiply-and-accumulate operations, making DL extremely compute- and energy-hungry. In scenarios where several sophisticated algorithms must be executed with limited energy and low latency, cost-effective hardware platforms capable of energy-efficient DL execution are needed. This paper first introduces the key properties of two brain-inspired models, namely Deep Neural Networks (DNNs) and Spiking Neural Networks (SNNs), and then analyzes techniques for producing efficient and high-performance designs. This work summarizes and compares state-of-the-art solutions across the four leading platforms for executing these algorithms, namely CPU, GPU, FPGA, and ASIC, giving particular prominence to the last two, since they offer greater design flexibility and the potential for high energy efficiency, especially for the inference process. In addition to hardware solutions, this paper discusses some of the important security issues that DNN and SNN models may face during their execution, and offers a comprehensive section on benchmarking, explaining how to assess the quality of different networks and of the hardware systems designed for them.
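The abstract's claim that a single DL inference may require billions of multiply-and-accumulate (MAC) operations can be checked with a back-of-the-envelope calculation. The sketch below counts MACs for a hypothetical VGG-style stack of convolutional layers; the layer sizes are illustrative assumptions, not figures taken from the survey.

```python
# Back-of-the-envelope MAC count for one forward pass through a
# hypothetical VGG-style convolutional stack (illustrative only).

def conv_macs(h, w, c_in, c_out, k):
    """MACs for one convolution layer: each of the h*w*c_out output
    values accumulates k*k*c_in products (stride 1, 'same' padding)."""
    return h * w * c_out * (k * k * c_in)

# (out_height, out_width, in_channels, out_channels, kernel_size)
layers = [
    (224, 224,   3,  64, 3),
    (224, 224,  64,  64, 3),
    (112, 112,  64, 128, 3),
    (112, 112, 128, 128, 3),
    ( 56,  56, 128, 256, 3),
    ( 56,  56, 256, 256, 3),
]

total = sum(conv_macs(*layer) for layer in layers)
print(f"total MACs per inference: {total:,}")
```

Even this truncated six-layer stack already exceeds seven billion MACs per inference, which illustrates why dedicated, energy-efficient hardware is attractive for DL workloads.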