A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law

Paper Title

A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law

Authors

Li, Chen; Tsourdos, Antonios; Guo, Weisi

Abstract

Deep Learning (DL) has transformed automation across a wide range of industries and is becoming increasingly ubiquitous in society. The high complexity of DL models and their widespread adoption have led to global energy consumption doubling every 3-4 months. Currently, the relationship between DL model configuration and energy consumption is not well established. At the level of a general computational energy model, there is a strong dependency on both the hardware architecture (e.g., generic processors with different configurations of inner components, such as CPUs and GPUs, and programmable integrated circuits, such as FPGAs) and the different interacting aspects of energy consumption (e.g., data movement, calculation, control). At the DL model level, we need to translate non-linear activation functions and their interaction with data into calculation tasks. Current methods mainly linearize nonlinear DL models to approximate their theoretical FLOPs and MACs as a proxy for energy consumption. Yet this is inaccurate (est. 93% accuracy) due to, for example, the highly nonlinear nature of many convolutional neural networks (CNNs). In this paper, we develop a bottom-level Transistor Operations (TOs) method to expose the role of non-linear activation functions and neural network structure in energy consumption. We translate a range of feedforward and CNN models into ALU calculation tasks and then into TO steps. These are then statistically linked to real energy consumption values via a regression model for different hardware configurations and data sets. We show that our proposed TOs method can achieve 93.61%-99.51% precision in predicting energy consumption.
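To make the MAC-based proxy mentioned in the abstract concrete, the following is a minimal sketch of how theoretical MAC counts are commonly computed for fully connected and convolutional layers, plus an ordinary least-squares line fit of the kind that could relate an operation count to measured energy. It illustrates the baseline proxy, not the paper's Transistor Operations method. The function names (`dense_macs`, `conv2d_macs`, `fit_line`) are hypothetical, and the single-variable linear fit is an assumption, since the abstract does not specify the form of the regression model.

```python
from typing import Sequence, Tuple


def dense_macs(layer_sizes: Sequence[int]) -> int:
    """Multiply-accumulate (MAC) count for one forward pass of a
    fully connected network.

    A layer mapping n_in inputs to n_out outputs needs n_in * n_out MACs.
    Bias additions and activation functions are ignored, which is exactly
    the simplification the abstract criticizes.
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def conv2d_macs(h_out: int, w_out: int, c_in: int, c_out: int, k: int) -> int:
    """MAC count for one k x k convolution layer whose output has
    spatial size h_out x w_out and c_out channels."""
    return h_out * w_out * c_out * c_in * k * k


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y ~ slope * x + intercept.

    Here x would be an operation count per model and y a measured energy
    value; no measurements are included in this sketch.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # A 784-128-10 feedforward network: 784*128 + 128*10 MACs.
    print(dense_macs([784, 128, 10]))
    # One 3x3 convolution, 1 input channel, 8 output channels, 28x28 output.
    print(conv2d_macs(28, 28, 1, 8, 3))
```

Because the MAC count treats every activation function as free, two networks with identical layer shapes but different non-linearities receive the same estimate, which is the gap the abstract says the TOs method is meant to close.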
