Paper Title
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Authors
Abstract
Deep Learning (DL) has transformed automation across a wide range of industries and is increasingly ubiquitous in society. The high complexity of DL models and their widespread adoption have led to global energy consumption doubling every 3-4 months. Currently, the relationship between DL model configuration and energy consumption is not well established. At the level of a general computational energy model, there is a strong dependency on both the hardware architecture (e.g., generic processors with differently configured inner components, such as CPUs and GPUs, and programmable integrated circuits, such as FPGAs) and the different interacting aspects of energy consumption (e.g., data movement, calculation, control). At the DL model level, we need to translate non-linear activation functions and their interaction with data into calculation tasks. Current methods mainly linearize non-linear DL models and use their theoretical FLOPs and MACs as a proxy for energy consumption. Yet this is inaccurate (est. 93% accuracy) because many models, convolutional neural networks (CNNs) for example, are highly non-linear. In this paper, we develop a bottom-level Transistor Operations (TOs) method to expose the role of non-linear activation functions and neural network structure in energy consumption. We translate a range of feedforward and CNN models into ALU calculation tasks and then into TO steps. These TO counts are then statistically linked to real energy consumption values via a regression model for different hardware configurations and datasets. We show that our proposed TOs method achieves 93.61% - 99.51% precision in predicting the energy consumption of these models.
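The "count operations, then regress against measured energy" pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how ALU tasks are converted into TO steps, so the sketch counts MACs for a fully connected network (the conventional proxy the abstract criticizes) and fits a single-variable least-squares line. The function names `dense_macs` and `fit_line` are hypothetical, and no measured energy values are included.

```python
from typing import Sequence, Tuple


def dense_macs(layer_sizes: Sequence[int]) -> int:
    """Count multiply-accumulate operations for one forward pass of a
    fully connected network, ignoring biases and activation functions.

    A layer with n_in inputs and n_out outputs needs n_in * n_out MACs.
    Assumption: this is the conventional MAC proxy, not the TOs count.
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept.

    Here x would hold operation counts (one per model configuration) and
    y the measured energy for the same configurations.
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In use, one would compute an operation count per model configuration, pair it with the energy measured on a given hardware platform, and call `fit_line` once per platform. Under the paper's approach, the same regression step would take TO counts rather than MAC counts as its input.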