梯度问题的梯度方法的停止规则

论文标题

梯度问题的梯度方法的停止规则

Stopping Rules for Gradient Methods for Non-Convex Problems with Additive Noise in Gradient

论文作者

Polyak, Boris T., Kuruzov, Ilia A., Stonyakin, Fedor S.

论文摘要

我们研究了梯度方法的假设，即一般来说，添加性不精确梯度可用于非凸问题。目标函数的非跨性别性以及在迭代时指定梯度的使用，可能会导致各种问题。例如，梯度方法的轨迹距离起点可能足够远。另一方面，在存在噪声的情况下，无界去除梯度方法的轨迹可能会导致从所需的精确溶液中去除该方法的轨迹。研究梯度方法轨迹行为的结果是在梯度不确定的假设和梯度优势条件下获得的。众所周知，这种情况对于许多重要的非凸问题有效。此外，它可以为梯度方法提供良好的复杂性。提出了提出梯度方法的早期停止规则。首先，它可以保证在功能方面达到该方法的出口点的可接受质量。其次，停止规则可确保该点与所选初始位置的相当中等距离。除了具有恒定步骤的梯度方法外，其具有自适应步长大小的变体还进行了详细研究，这使得在未知的Lipschitz常数为梯度的情况下，可以应用开发的技术。已经进行了一些计算实验，以证明所研究梯度方法提出的停止规则的有效性。

We study the gradient method under the assumption that an additively inexact gradient is available for, generally speaking, non-convex problems. The non-convexity of the objective function, as well as the use of an inexactness specified gradient at iterations, can lead to various problems. For example, the trajectory of the gradient method may be far enough away from the starting point. On the other hand, the unbounded removal of the trajectory of the gradient method in the presence of noise can lead to the removal of the trajectory of the method from the desired exact solution. The results of investigating the behavior of the trajectory of the gradient method are obtained under the assumption of the inexactness of the gradient and the condition of gradient dominance. It is well known that such a condition is valid for many important non-convex problems. Moreover, it leads to good complexity guarantees for the gradient method. A rule of early stopping of the gradient method is proposed. Firstly, it guarantees achieving an acceptable quality of the exit point of the method in terms of the function. Secondly, the stopping rule ensures a fairly moderate distance of this point from the chosen initial position. In addition to the gradient method with a constant step, its variant with adaptive step size is also investigated in detail, which makes it possible to apply the developed technique in the case of an unknown Lipschitz constant for the gradient. Some computational experiments have been carried out which demonstrate effectiveness of the proposed stopping rule for the investigated gradient methods.

下载PDF全文

下载文献需遵守相关版权规定

论文标题