Energy Consumption of Neural Networks on NVIDIA Edge Boards: an Empirical Model

Paper title

Energy Consumption of Neural Networks on NVIDIA Edge Boards: an Empirical Model

Authors

Seyyidahmed Lahmer, Aria Khoshsirat, Michele Rossi, Andrea Zanella

Abstract

Recently, there has been a trend of shifting the execution of deep learning inference tasks toward the edge of the network, closer to the user, to reduce latency and preserve data privacy. At the same time, growing interest is being devoted to the energetic sustainability of machine learning. At the intersection of these trends, we hence find the energetic characterization of machine learning at the edge, which is attracting increasing attention. Unfortunately, calculating the energy consumption of a given neural network during inference is complicated by the heterogeneity of the possible underlying hardware implementation. In this work, we hence aim at profiling the energetic consumption of inference tasks for some modern edge nodes and deriving simple but realistic models. To this end, we performed a large number of experiments to collect the energy consumption of convolutional and fully connected layers on two well-known edge boards by NVIDIA, namely Jetson TX2 and Xavier. From the measurements, we have then distilled a simple, practical model that can provide an estimate of the energy consumption of a certain inference task on the considered boards. We believe that this model can be used in many contexts as, for instance, to guide the search for efficient architectures in Neural Architecture Search, as a heuristic in Neural Network pruning, or to find energy-efficient offloading strategies in a Split computing context, or simply to evaluate the energetic performance of Deep Neural Network architectures.
