Adaptive Aggregation for Federated Learning

Paper Title

Adaptive Aggregation for Federated Learning

Authors

K. R. Jayaram, Vinod Muthusamy, Gegi Thomas, Ashish Verma, Mark Purcell

Abstract

Advances in federated learning (FL) algorithms, along with technologies like differential privacy and homomorphic encryption, have led to FL being increasingly adopted and used in many application domains. This increasing adoption has led to rapid growth in the number, size (number of participants/parties) and diversity (intermittent vs. active parties) of FL jobs. Many existing FL systems, based on centralized (often single) model aggregators, are unable to scale to handle large FL jobs and adapt to parties' behavior. In this paper, we present a new scalable and adaptive architecture for FL aggregation. First, we demonstrate how traditional tree-overlay-based aggregation techniques (from P2P, publish-subscribe and stream processing research) can help FL aggregation scale, but are ineffective from a resource utilization and cost standpoint. Next, we present the design and implementation of AdaFed, which uses serverless/cloud functions to adaptively scale aggregation in a resource-efficient and fault-tolerant manner. We describe how AdaFed enables FL aggregation to be dynamically deployed only when necessary, elastically scaled to handle participant joins/leaves, and made fault tolerant with minimal effort required on the (aggregation) programmer side. We also demonstrate that our prototype based on Ray scales to thousands of participants, and is able to achieve a >90% reduction in resource requirements and cost, with minimal impact on aggregation latency.
