Paper Title
AI on the Edge: Rethinking AI-based IoT Applications Using Specialized Edge Architectures
Paper Authors
Paper Abstract
Edge computing has emerged as a popular paradigm for supporting mobile and IoT applications with low-latency or high-bandwidth needs. The attractiveness of edge computing has been further enhanced by the recent availability of special-purpose hardware that accelerates specific compute tasks, such as deep learning inference, on edge nodes. In this paper, we experimentally compare the benefits and limitations of specialized edge systems, built using edge accelerators, against more traditional forms of edge and cloud computing. Our experimental study using edge-based AI workloads shows that today's edge accelerators can provide performance comparable to, and in many cases better than, traditional edge and cloud servers when normalized for power or cost. They also provide latency and bandwidth benefits for split processing, across and within tiers, when using model compression or model splitting, but require dynamic methods to determine the optimal split across tiers. We find that edge accelerators can support varying degrees of concurrency for multi-tenant inference applications, but lack the isolation mechanisms necessary for edge-cloud multi-tenant hosting.