Paper Title
Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning
Paper Authors
Paper Abstract
To leverage the data and computation capabilities of mobile devices, machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models, resulting in the new paradigm of edge learning. In this paper, we consider the framework of partitioned edge learning for iteratively training a large-scale model using many resource-constrained devices (called workers). To this end, in each iteration, the model is dynamically partitioned into parametric blocks, which are downloaded to worker groups for updating using data subsets. Then, the local updates are uploaded to and cascaded by the server to update a global model. To reduce resource usage by minimizing the total learning-and-communication latency, this work focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation (for downloading and uploading). Two design approaches are adopted. First, a practical sequential approach, called partially integrated parameter-and-bandwidth allocation (PABA), yields two schemes, namely bandwidth-aware parameter allocation and parameter-aware bandwidth allocation. The former minimizes the load for the slowest (in computing) of the worker groups, each training the same parametric block. The latter allocates the largest bandwidth to the worker that is the latency bottleneck. Second, parameter and bandwidth allocation are jointly optimized. Although the problem is nonconvex, an efficient and optimal solution algorithm is derived by nesting a bisection search with the solution of a convex problem. Experimental results using real data demonstrate that integrating PABA can substantially improve the performance of partitioned edge learning in terms of latency (by, e.g., 46%) and accuracy (by, e.g., 4%).
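The nested structure described in the abstract (a bisection search over the target latency, with a convex feasibility subproblem inside) can be illustrated with a minimal sketch. This is not the paper's exact formulation: the latency model below (per-worker compute time proportional to its parameter load, plus an upload time inversely proportional to its assigned bandwidth, under a total bandwidth budget) and all function and parameter names are simplifying assumptions made for illustration.

```python
def min_latency(loads, cycles_per_param, bits_per_param, rates,
                bandwidth_budget, tol=1e-6):
    """Bisect on a target latency T; the inner feasibility check plays the
    role of the convex subproblem.

    Assumed (illustrative) model: worker n with parameter load loads[n]
    computes for loads[n] * cycles_per_param[n] seconds, then uploads
    loads[n] * bits_per_param bits at spectral efficiency rates[n] over
    its assigned bandwidth, with all bandwidths summing to at most
    bandwidth_budget.
    """
    def feasible(T):
        # Minimal bandwidth for each worker to finish by T; feasible if
        # the total fits in the budget.
        total_bw = 0.0
        for p, c, r in zip(loads, cycles_per_param, rates):
            compute_time = p * c
            if T <= compute_time:
                return False  # worker cannot even finish computing by T
            # Smallest bandwidth so the upload fits in the remaining time.
            total_bw += (p * bits_per_param) / (r * (T - compute_time))
        return total_bw <= bandwidth_budget

    # Grow an upper bound until the target latency becomes feasible.
    lo, hi = 0.0, 1.0
    while not feasible(hi):
        hi *= 2.0
    # Standard bisection: the feasible set {T : feasible(T)} is an interval.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

The key property exploited is monotonicity: if a latency target T is achievable, any T' > T is too, so bisection converges to the minimum latency while each iteration only solves a simple feasibility check.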