Paper Title

Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem

Authors

Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra

Abstract


This paper considers the multi-agent linear least-squares problem in a server-agent network. In this problem, the system comprises multiple agents, each having a set of local data points, that are connected to a server. The goal for the agents is to compute a linear mathematical model that optimally fits the collective data points held by all the agents, without sharing their individual local data points. This goal can be achieved, in principle, using the server-agent variant of the traditional iterative gradient-descent method. The gradient-descent method converges linearly to a solution, and its rate of convergence is lower bounded by the conditioning of the agents' collective data points. If the data points are ill-conditioned, the gradient-descent method may require a large number of iterations to converge. We propose an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method. We rigorously show that the resulting pre-conditioned gradient-descent method, with the proposed iterative pre-conditioning, achieves superlinear convergence when the least-squares problem has a unique solution. In general, the convergence is linear with improved rate of convergence in comparison to the traditional gradient-descent method and the state-of-the-art accelerated gradient-descent methods. We further illustrate the improved rate of convergence of our proposed algorithm through experiments on different real-world least-squares problems in both noise-free and noisy computation environment.
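To make the idea concrete, here is a minimal single-machine sketch of iteratively pre-conditioned gradient descent for a least-squares cost ||Ax − b||². It is an illustration of the general technique on synthetic data, not the authors' exact server-agent algorithm: the pre-conditioner K is refined toward the inverse Hessian by a Richardson-type update alongside the descent iterations, so no matrix inversion is ever performed explicitly, and the step size 1/L is a simplifying assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ill-conditioned least-squares problem ||Ax - b||^2 (illustrative data).
m, n = 200, 20
A = rng.standard_normal((m, n)) @ np.diag(np.logspace(0, 1.5, n))
x_true = rng.standard_normal(n)
b = A @ x_true

H = A.T @ A                      # Hessian of the least-squares cost
L = np.linalg.eigvalsh(H).max()  # largest eigenvalue, used for step sizes

# Plain gradient descent: rate limited by the conditioning of H.
x = np.zeros(n)
for _ in range(500):
    x = x - (1.0 / L) * (H @ x - A.T @ b)

# Pre-conditioned gradient descent: K is itself updated iteratively toward
# H^{-1} via the Richardson-type recursion K <- K - (1/L)(H K - I), and each
# descent step is multiplied by the current K.
K = np.zeros((n, n))
y = np.zeros(n)
for _ in range(500):
    K = K - (1.0 / L) * (H @ K - np.eye(n))   # pre-conditioner update
    y = y - K @ (H @ y - A.T @ b)             # pre-conditioned descent step

err_gd = np.linalg.norm(x - x_true)
err_pre = np.linalg.norm(y - x_true)
print(err_gd, err_pre)
```

With K₀ = 0 the error of the pre-conditioned iterate contracts by the factor (I − H/L)ᵏ at iteration k, so the effective contraction compounds over iterations; on ill-conditioned data the pre-conditioned iterate reaches a far smaller error than plain gradient descent within the same iteration budget.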
