Paper Title

Faster Non-Convex Federated Learning via Global and Local Momentum

Paper Authors

Das, Rudrajit, Acharya, Anish, Hashemi, Abolfazl, Sanghavi, Sujay, Dhillon, Inderjit S., Topcu, Ufuk

Paper Abstract

We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(\epsilon^{-1.5})$ to converge to an $\epsilon$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq \epsilon$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(\epsilon^{-2})$ complexity of most prior works. Our key algorithmic idea that enables achieving this improved complexity is based on the observation that convergence in FL is hampered by two sources of high variance: (i) the global server aggregation step with multiple local updates, exacerbated by client heterogeneity, and (ii) the noise of the local client-level stochastic gradients. By modeling the server aggregation step as a generalized gradient-type update, we propose a variance-reducing momentum-based global update at the server, which, when applied in conjunction with variance-reduced local updates at the clients, enables \texttt{FedGLOMO} to enjoy an improved convergence rate. Moreover, we derive our results under a novel and more realistic client-heterogeneity assumption which we verify empirically -- unlike prior assumptions that are hard to verify. Our experiments illustrate the intrinsic variance reduction effect of \texttt{FedGLOMO}, which implicitly suppresses client-drift in heterogeneous data distribution settings and promotes communication efficiency.
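To make the server-side idea concrete: the averaged client update $\Delta_t(\bm{x})$ can be treated as a pseudo-gradient and combined with a STORM-style momentum correction of the form $\bm{v}_t = \Delta_t(\bm{x}_t) + (1-\beta)\,(\bm{v}_{t-1} - \Delta_t(\bm{x}_{t-1}))$. The sketch below illustrates this pattern on a toy least-squares problem. It is a simplified illustration of the technique described in the abstract, not the paper's exact pseudocode (it omits, e.g., compressed communication), and all names (`local_pseudo_grads`, `server_loop`, `eta`, `beta`) are invented for the example.

```python
import numpy as np

def local_pseudo_grads(x_curr, x_prev, data, lr=0.01, steps=5, rng=None):
    """Run the same sequence of local SGD steps starting from both
    x_curr and x_prev, reusing identical minibatch samples, and return
    the two pseudo-gradients (x0 - x_final). Sharing the samples is
    what lets the momentum correction below cancel variance.
    Illustrative stand-in for a client's local routine."""
    rng = rng or np.random.default_rng(0)
    idxs = rng.integers(len(data["A"]), size=steps)  # shared minibatch indices
    deltas = []
    for x0 in (x_curr, x_prev):
        x = x0.copy()
        for i in idxs:
            a, b = data["A"][i], data["b"][i]
            x -= lr * (a @ x - b) * a  # toy least-squares stochastic gradient
        deltas.append(x0 - x)
    return deltas  # [Delta(x_curr), Delta(x_prev)]

def server_loop(clients, dim, rounds=100, eta=1.0, beta=0.1, seed=0):
    """Server aggregation with a STORM-style momentum term:
        v_t = Delta_t(x_t) + (1 - beta) * (v_{t-1} - Delta_t(x_{t-1})),
    where the correction term shrinks the variance contributed by
    heterogeneous, noisy local updates. A sketch, not the paper's
    exact algorithm."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=dim)
    x_prev, v = x.copy(), None
    for _ in range(rounds):
        pairs = [local_pseudo_grads(x, x_prev, c, rng=rng) for c in clients]
        delta_curr = np.mean([p[0] for p in pairs], axis=0)
        delta_prev = np.mean([p[1] for p in pairs], axis=0)
        if v is None:
            v = delta_curr  # first round: plain FedAvg-style aggregation
        else:
            v = delta_curr + (1.0 - beta) * (v - delta_prev)
        x_prev, x = x.copy(), x - eta * v
    return x

# Toy usage: four clients with deliberately heterogeneous data.
rng = np.random.default_rng(1)
clients = [{"A": rng.normal(size=(20, 5)) + i, "b": rng.normal(size=20)}
           for i in range(4)]
x_final = server_loop(clients, dim=5)
```

Note the design choice: each client evaluates its local update at both the current and previous server iterates with the same samples, which is what makes the correction term $\bm{v}_{t-1} - \Delta_t(\bm{x}_{t-1})$ low-variance rather than adding fresh noise.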
