Paper Title
Distributed Optimization, Averaging via ADMM, and Network Topology
Paper Authors
Paper Abstract
There has been an increasing necessity for scalable optimization methods, especially due to the explosion in the size of datasets and model complexity in modern machine learning applications. Scalable solvers often distribute the computation over a network of processing units. For simple algorithms such as gradient descent, the dependence of the convergence time on the topology of this network is well understood. However, for more involved algorithms such as the Alternating Direction Method of Multipliers (ADMM), much less is known. At the heart of many distributed optimization algorithms there exists a gossip subroutine which averages local information over the network, and whose efficiency is crucial for the overall performance of the method. In this paper we review recent research in this area and, with the goal of isolating such communication exchange behaviour, we compare different algorithms when applied to a canonical distributed averaging consensus problem. We also show interesting connections between ADMM and lifted Markov chains, besides providing an explicit characterization of its convergence and optimal parameter tuning in terms of spectral properties of the network. Finally, we empirically study the connection between network topology and convergence rates for different algorithms on a real-world sensor localization problem.
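To make the gossip averaging subroutine mentioned in the abstract concrete, here is a minimal sketch of distributed averaging consensus: each node repeatedly replaces its local value with a weighted average of its neighbours' values, and all nodes converge to the global mean at a rate governed by the spectral properties of the mixing matrix. The ring topology, Metropolis weights, and iteration count below are illustrative choices, not taken from the paper.

```python
import numpy as np

def metropolis_weights(adj):
    """Build a symmetric, doubly stochastic mixing matrix W from an
    adjacency matrix using the Metropolis-Hastings rule:
    W[i, j] = 1 / (1 + max(deg_i, deg_j)) for each edge (i, j)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
    # Diagonal entries absorb the remaining mass so each row sums to 1.
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))
    return W

# Ring (cycle) topology on 6 nodes.
n = 6
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

W = metropolis_weights(adj)
x = np.arange(n, dtype=float)   # each node starts with a local value
target = x.mean()               # the consensus value all nodes should reach

# Gossip iterations: x <- W x uses only neighbour-to-neighbour exchanges.
for _ in range(500):
    x = W @ x

print(np.allclose(x, target, atol=1e-6))
```

The convergence speed here is controlled by the second-largest eigenvalue modulus of W, which is exactly the kind of topology-dependent quantity the paper analyzes; poorly connected graphs (long rings, chains) have this eigenvalue close to 1 and mix slowly.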