Paper Title
Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization
Paper Authors
Paper Abstract
Nonsmooth nonconvex optimization problems broadly emerge in machine learning and business decision making, whereas two core challenges impede the development of efficient solution methods with finite-time convergence guarantees: the lack of a computationally tractable optimality criterion and the lack of computationally powerful oracles. The contributions of this paper are two-fold. First, we establish the relationship between the celebrated Goldstein subdifferential~\citep{Goldstein-1977-Optimization} and uniform smoothing, thereby providing the basis and intuition for the design of gradient-free methods that guarantee finite-time convergence to a set of Goldstein stationary points. Second, we propose the gradient-free method (GFM) and stochastic GFM (SGFM) for solving a class of nonsmooth nonconvex optimization problems, and prove that both of them can return a $(δ,ε)$-Goldstein stationary point of a Lipschitz function $f$ at an expected convergence rate of $O(d^{3/2}δ^{-1}ε^{-4})$, where $d$ is the problem dimension. Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results. Finally, we demonstrate the effectiveness of 2-SGFM on training ReLU neural networks with the \textsc{MNIST} dataset.
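To make the abstract's core idea concrete, here is a minimal sketch of a gradient-free step built on uniform smoothing: a two-point finite-difference estimator along a random direction on the unit sphere gives an unbiased estimate of the gradient of the smoothed surrogate $f_\delta(x) = \mathbb{E}[f(x+\delta u)]$. The function names, step size, and iteration count below are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import numpy as np


def gfm_gradient_estimate(f, x, delta, rng):
    """Two-point estimator of the gradient of the uniformly smoothed f_delta.

    Samples w uniformly from the unit sphere; the estimator
    (d / (2*delta)) * (f(x + delta*w) - f(x - delta*w)) * w
    is unbiased for grad f_delta(x).
    """
    w = rng.standard_normal(x.shape)
    w /= np.linalg.norm(w)  # direction uniform on the unit sphere
    d = x.size
    return (d / (2.0 * delta)) * (f(x + delta * w) - f(x - delta * w)) * w


def gfm(f, x0, step, delta, iters, rng):
    """Illustrative gradient-free iteration using the estimator above."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * gfm_gradient_estimate(f, x, delta, rng)
    return x


# Example: drive the nonsmooth f(x) = ||x||_1 toward its minimizer at 0.
rng = np.random.default_rng(0)
x_final = gfm(lambda v: np.abs(v).sum(),
              np.array([1.0, -2.0, 3.0]),
              step=0.01, delta=0.1, iters=5000, rng=rng)
```

The estimator only queries function values, never gradients, which is what makes the method applicable to Lipschitz functions that are nondifferentiable on a dense set.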