Title


A regularized Newton method for $\ell_q$-norm composite optimization problems

Authors

Yuqia Wu, Shaohua Pan, Xiaoqi Yang

Abstract


This paper is concerned with $\ell_q\,(0<q<1)$-norm regularized minimization problems with a twice continuously differentiable loss function. For this class of nonconvex and nonsmooth composite problems, many algorithms have been proposed, most of which are first-order methods. In this work, we propose a hybrid of the proximal gradient method and a subspace regularized Newton method, named HpgSRN. The whole iterate sequence produced by HpgSRN is proved to have finite length and to converge to an $L$-type stationary point under a mild curve-ratio condition and the Kurdyka-Łojasiewicz property of the cost function; the convergence is linear if, in addition, the Kurdyka-Łojasiewicz property holds with exponent $1/2$. Moreover, a superlinear convergence rate of the iterate sequence is achieved under an additional local error bound condition. Our convergence results require neither the isolatedness nor the strict local minimality of the $L$-stationary point. Numerical comparisons with ZeroFPR, a hybrid of the proximal gradient method and a quasi-Newton method for the forward-backward envelope of the cost function, proposed in [A. Themelis, L. Stella, and P. Patrinos, {\em SIAM J. Optim.,} 28 (2018), pp. 2274-2303], on $\ell_q$-norm regularized linear and logistic regression with real data indicate that HpgSRN not only requires much less computing time but also yields comparable or even better sparsity and objective function values.
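To make the hybrid scheme concrete, here is a minimal numerical sketch of a proximal-gradient step combined with a regularized Newton step on the support subspace, for the least-squares loss $\tfrac12\|Ax-b\|^2 + \lambda\sum_i |x_i|^q$. This is an illustrative sketch, not the paper's exact HpgSRN algorithm: the function names `prox_lq_scalar` and `hpg_srn_sketch`, the regularization parameter `mu`, and the simple accept-if-decrease safeguard are all assumptions made for the example. The scalar $\ell_q$ prox is computed by comparing $t=0$ against the larger stationary point, found by Newton's method on the stationarity equation.

```python
import numpy as np

def prox_lq_scalar(z, lam, q, tol=1e-12, max_iter=100):
    """Prox of t -> lam*|t|^q at z, 0<q<1: compare t=0 with the larger
    root of t + lam*q*t^(q-1) = |z|, found by Newton's method."""
    s, a = np.sign(z), abs(z)
    if a == 0.0:
        return 0.0
    t = a  # start Newton from |z|; iterates decrease toward the root
    for _ in range(max_iter):
        h = t + lam * q * t ** (q - 1) - a
        dh = 1.0 + lam * q * (q - 1) * t ** (q - 2)
        if dh <= 0.0:          # entered the region with no valid root
            return 0.0
        t_new = t - h / dh
        if t_new <= 0.0:       # no positive stationary point: threshold
            return 0.0
        if abs(t_new - t) < tol:
            t = t_new
            break
        t = t_new
    # keep the stationary point only if it beats t = 0
    f_t = 0.5 * (t - a) ** 2 + lam * t ** q
    f_0 = 0.5 * a ** 2
    return s * t if f_t < f_0 else 0.0

def hpg_srn_sketch(A, b, lam, q=0.5, iters=200, mu=1e-8):
    """Hybrid PG / subspace regularized-Newton sketch (illustrative,
    NOT the paper's HpgSRN) for 0.5*||Ax-b||^2 + lam*sum|x_i|^q."""
    m, n = A.shape
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the smooth part
    x = np.zeros(n)
    def F(v):
        return 0.5 * np.sum((A @ v - b) ** 2) + lam * np.sum(np.abs(v) ** q)
    for _ in range(iters):
        # proximal gradient step (coordinatewise lq prox)
        z = x - step * (A.T @ (A @ x - b))
        x_pg = np.array([prox_lq_scalar(zi, step * lam, q) for zi in z])
        S = np.flatnonzero(x_pg)
        x = x_pg
        if S.size:
            # regularized Newton step restricted to the support of x_pg
            AS, xs = A[:, S], x_pg[S]
            g = AS.T @ (AS @ xs - b) + lam * q * np.sign(xs) * np.abs(xs) ** (q - 1)
            H = (AS.T @ AS
                 + np.diag(lam * q * (q - 1) * np.abs(xs) ** (q - 2))
                 + mu * np.eye(S.size))
            try:
                d = np.linalg.solve(H, -g)
                cand = x_pg.copy()
                cand[S] = xs + d
                # keep the Newton step only if it preserves signs and decreases F
                if np.all(cand[S] * xs > 0) and F(cand) < F(x_pg):
                    x = cand
            except np.linalg.LinAlgError:
                pass
    return x
```

Note the sign-preservation check before accepting the Newton step: the term $|t|^q$ is nonsmooth at $0$, so the second-order model is only valid while the iterate stays in the same orthant on the current support.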
