论文标题
快速自适应联合双杆优化
Fast Adaptive Federated Bilevel Optimization
论文作者
论文摘要
二重性优化是机器学习中流行的层次结构模型,已广泛应用于许多机器学习任务,例如元学习,超参数学习和策略优化。尽管最近已经开发了许多二聚体优化算法,但是很少有自适应算法着重于分布式设置下的二元优化。众所周知,自适应梯度方法在分布式和非分布优化方面都表现出卓越的性能。因此,在本文中,我们提出了一种新型的自适应联合双光线优化算法(即ADAFBIO)来解决分布式的双层优化问题,其中上层(UL)问题的目标函数可能是非convex的,并且低级(LL)问题的问题是强烈的。具体而言,我们的ADAFBIO算法建立在基于动量的方差降低技术和局部SGD的基础上,以同时获得最著名的样本和通信复杂性。特别是,我们的ADAFBIO算法使用统一的自适应矩阵灵活地合并了各种自适应学习率,以更新UL和LL问题中的变量。此外,我们为ADAFBIO算法提供了一个收敛分析框架,并证明它需要$ \ tilde {o}的样本复杂性(ε^{ - 3})$,并具有$ \ tilde {o}的通信复杂性{o}(ε^{ - 2})$的通信复杂性才能获得$ε$ $ $ - $ sStationary point。联合超代表学习和联合数据超清洁任务的实验结果验证了我们算法的效率。
Bilevel optimization is a popular hierarchical model in machine learning, and has been widely applied to many machine learning tasks such as meta learning, hyperparameter learning and policy optimization. Although many bilevel optimization algorithms recently have been developed, few adaptive algorithm focuses on the bilevel optimization under the distributed setting. It is well known that the adaptive gradient methods show superior performances on both distributed and non-distributed optimization. In the paper, thus, we propose a novel adaptive federated bilevel optimization algorithm (i.e.,AdaFBiO) to solve the distributed bilevel optimization problems, where the objective function of Upper-Level (UL) problem is possibly nonconvex, and that of Lower-Level (LL) problem is strongly convex. Specifically, our AdaFBiO algorithm builds on the momentum-based variance reduced technique and local-SGD to obtain the best known sample and communication complexities simultaneously. In particular, our AdaFBiO algorithm uses the unified adaptive matrices to flexibly incorporate various adaptive learning rates to update variables in both UL and LL problems. Moreover, we provide a convergence analysis framework for our AdaFBiO algorithm, and prove it needs the sample complexity of $\tilde{O}(ε^{-3})$ with communication complexity of $\tilde{O}(ε^{-2})$ to obtain an $ε$-stationary point. Experimental results on federated hyper-representation learning and federated data hyper-cleaning tasks verify efficiency of our algorithm.