Real-Time Edge Classification: Optimal Offloading under Token Bucket Constraints

Paper title

Real-Time Edge Classification: Optimal Offloading under Token Bucket Constraints

Authors

Ayan Chakrabarti, Roch Guérin, Chenyang Lu, Jiangnan Liu

Abstract

To deploy machine learning-based algorithms for real-time applications with strict latency constraints, we consider an edge-computing setting where a subset of inputs is offloaded to the edge for processing by an accurate but resource-intensive model, and the rest are processed only by a less accurate model on the device itself. Both models have computational costs that match the available compute resources, and both process inputs with low latency. Offloading, however, incurs network delays; to manage these delays and meet application deadlines, we use a token bucket to constrain the average rate and burst length of transmissions from the device. We introduce a Markov Decision Process-based framework to make offload decisions under these constraints, based on the local model's confidence and the token bucket state, with the goal of minimizing a specified error measure for the application. Beyond isolated decisions for individual devices, we also propose approaches that allow multiple devices connected to the same access switch to share their bursting allocation. We evaluate and analyze the policies derived using our framework on the standard ImageNet image classification benchmark.
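
As a rough illustration of the token bucket constraint described in the abstract, the sketch below models a bucket that refills at a fixed rate per input slot (the average offload rate) up to a fixed depth (the burst length). The names `TokenBucket` and `threshold_policy`, the one-token-per-offload cost, and the fixed confidence threshold are all assumptions made for this example. In particular, the threshold rule is a naive stand-in and is not the MDP-derived policy the paper proposes.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical token bucket; parameter names are illustrative only."""
    rate: float          # tokens added per input slot (average offload rate)
    capacity: float      # bucket depth (bounds the burst length)
    tokens: float = 0.0  # current fill level

    def tick(self) -> None:
        """Add one slot's worth of tokens, capped at the bucket depth."""
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def try_consume(self) -> bool:
        """Spend one token if available; return whether an offload is allowed."""
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def threshold_policy(confidences, bucket, threshold=0.5):
    """Naive stand-in policy (an assumption, not the paper's MDP policy):
    offload an input when the local model's confidence is below `threshold`
    and the bucket still holds a token."""
    decisions = []
    for confidence in confidences:
        # Short-circuit: a token is only spent on low-confidence inputs.
        offload = confidence < threshold and bucket.try_consume()
        decisions.append(offload)
        bucket.tick()  # refill once per input slot
    return decisions


if __name__ == "__main__":
    bucket = TokenBucket(rate=0.25, capacity=2.0, tokens=2.0)
    print(threshold_policy([0.2, 0.3, 0.1, 0.9, 0.4], bucket))
    # prints [True, True, False, False, True]
```

In this run the third input has low confidence but is still processed locally because the bucket is empty. A policy that also conditions on the bucket state, as the abstract describes, could choose to hold tokens back for such cases.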
