Paper Title


Stochastic Steepest Descent Methods for Linear Systems: Greedy Sampling & Momentum

Paper Authors

Md Sarowar Morshed, Sabbir Ahmad, Md Noor-E-Alam

Abstract


Recently proposed adaptive Sketch & Project (SP) methods connect several well-known projection methods such as Randomized Kaczmarz (RK), Randomized Block Kaczmarz (RBK), Motzkin Relaxation (MR), Randomized Coordinate Descent (RCD), Capped Coordinate Descent (CCD), etc. into one framework for solving linear systems. In this work, we first propose a Stochastic Steepest Descent (SSD) framework that connects SP methods with the well-known Steepest Descent (SD) method for solving positive-definite linear systems of equations. We then introduce two greedy sampling strategies in the SSD framework that allow us to obtain algorithms such as Sampling Kaczmarz-Motzkin (SKM), Sampling Block Kaczmarz (SBK), Sampling Coordinate Descent (SCD), etc. In doing so, we generalize the existing sampling rules into one framework and develop an efficient version of SP methods. Furthermore, we incorporate the Polyak momentum technique into the SSD method to accelerate the resulting algorithms. We provide global convergence results for both the SSD method and the momentum-induced SSD method. Moreover, we prove an $\mathcal{O}(\frac{1}{k})$ convergence rate for the Cesàro average of iterates generated by both methods. By varying parameters in the SSD method, we obtain classical convergence results of the SD method as well as the SP methods as special cases. We design computational experiments to demonstrate the performance of the proposed greedy sampling methods as well as the momentum methods. The proposed greedy methods significantly outperform the existing methods for a wide variety of datasets, including random test instances as well as real-world datasets (LIBSVM, sparse datasets from the Matrix Market collection). Finally, the momentum algorithms designed in this work accelerate the algorithmic performance of the SSD methods.
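To make the greedy-sampling idea concrete, the following is a minimal sketch of one representative algorithm the abstract names, Sampling Kaczmarz-Motzkin (SKM), combined with a Polyak (heavy-ball) momentum term. This is an illustrative reconstruction from the standard SKM/momentum literature, not the paper's exact formulation; the function name, parameter names (`beta`, `lam`, `gamma`), and default values are assumptions.

```python
import numpy as np

def skm_momentum(A, b, n_iters=2000, beta=5, lam=1.0, gamma=0.3, seed=0):
    """Illustrative SKM update with Polyak momentum for Ax = b.

    Each iteration: (1) sample `beta` rows uniformly at random,
    (2) greedily pick the sampled row with the largest absolute
    residual (the Motzkin rule), (3) take a relaxed Kaczmarz
    projection onto that row's hyperplane, and (4) add a heavy-ball
    momentum term gamma * (x_k - x_{k-1}).
    Parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x_prev = np.zeros(n)
    x = np.zeros(n)
    for _ in range(n_iters):
        rows = rng.choice(m, size=min(beta, m), replace=False)
        residuals = A[rows] @ x - b[rows]
        i = rows[np.argmax(np.abs(residuals))]   # greedy Motzkin pick
        a_i = A[i]
        step = lam * (a_i @ x - b[i]) / (a_i @ a_i) * a_i
        x_new = x - step + gamma * (x - x_prev)  # Polyak momentum
        x_prev, x = x, x_new
    return x
```

On a consistent overdetermined system (e.g. a random Gaussian `A` with `b = A @ x_star`), the iterates converge to a solution; the greedy pick among the sampled rows typically reduces the residual faster than uniform single-row Kaczmarz, which is the trade-off the greedy sampling strategies in the abstract exploit.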
