Paper Title

Smooth over-parameterized solvers for non-smooth structured optimization

Paper Authors

Clarice Poon, Gabriel Peyré

Paper Abstract

Non-smooth optimization is a core ingredient of many imaging or machine learning pipelines. Non-smoothness encodes structural constraints on the solutions, such as sparsity, group sparsity, low-rank and sharp edges. It is also the basis for the definition of robust loss functions and scale-free functionals such as square-root Lasso. Standard approaches to deal with non-smoothness leverage either proximal splitting or coordinate descent. These approaches are effective but usually require parameter tuning, preconditioning or some sort of support pruning. In this work, we advocate and study a different route, which operates a non-convex but smooth over-parametrization of the underlying non-smooth optimization problems. This generalizes quadratic variational forms that are at the heart of the popular Iterative Reweighted Least Squares (IRLS). Our main theoretical contribution connects gradient descent on this reformulation to a mirror descent flow with a varying Hessian metric. This analysis is crucial to derive convergence bounds that are dimension-free. This explains the efficiency of the method when using small grid sizes in imaging. Our main algorithmic contribution is to apply the Variable Projection (VarPro) method which defines a new formulation by explicitly minimizing over part of the variables. This leads to a better conditioning of the minimized functional and improves the convergence of simple but very efficient gradient-based methods, for instance quasi-Newton solvers. We exemplify the use of this new solver for the resolution of regularized regression problems for inverse problems and supervised learning, including total variation prior and non-convex regularizers.
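
To make the construction concrete, here is the standard quadratic variational form at the heart of IRLS, written for the l1 norm; taking the Lasso as a special case of the general framework in the abstract is our illustrative assumption. For any x,

\[ \|x\|_1 = \min_{\eta > 0} \; \frac{1}{2} \sum_i \Big( \frac{x_i^2}{\eta_i} + \eta_i \Big), \qquad \eta_i^\star = |x_i|, \]

and over-parameterizing x = u \odot v (entrywise product) gives the smooth form

\[ \|x\|_1 = \min_{u \odot v = x} \; \frac{1}{2} \big( \|u\|_2^2 + \|v\|_2^2 \big), \]

which follows from |u_i v_i| \le (u_i^2 + v_i^2)/2 with equality at |u_i| = |v_i| = \sqrt{|x_i|}. Substituting into the Lasso therefore yields a smooth but non-convex objective,

\[ \min_{u, v} \; \frac{1}{2} \|A(u \odot v) - b\|_2^2 + \frac{\lambda}{2} \big( \|u\|_2^2 + \|v\|_2^2 \big). \]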
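
On this over-parameterized Lasso, the VarPro step is an exact inner minimization: for fixed u the objective is a ridge regression in v with a closed-form solution, so a quasi-Newton method can run on the reduced functional g(u) = f(u, v*(u)). Below is a minimal sketch assuming NumPy/SciPy; the problem data (A, b, lam), the dense linear solve, and the plain L-BFGS-B setup are illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize

# Illustrative (assumed) problem data: a small random sparse-regression instance.
rng = np.random.default_rng(0)
n, p, lam = 50, 200, 0.1
A = rng.standard_normal((n, p))
x_true = np.zeros(p)
x_true[:5] = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(n)

def v_star(u):
    # Inner minimization over v for fixed u: a ridge regression with
    # design A @ diag(u), solved here in closed form.
    Au = A * u  # columnwise scaling, i.e. A @ diag(u)
    return np.linalg.solve(Au.T @ Au + lam * np.eye(p), Au.T @ b)

def g_and_grad(u):
    # VarPro objective g(u) = f(u, v*(u)) and its gradient; by the
    # envelope theorem, grad g(u) = partial_u f evaluated at (u, v*(u)).
    v = v_star(u)
    r = A @ (u * v) - b  # residual
    val = 0.5 * r @ r + 0.5 * lam * (u @ u + v @ v)
    grad = v * (A.T @ r) + lam * u
    return val, grad

# Quasi-Newton (L-BFGS) on the smooth reduced functional g.
res = minimize(g_and_grad, np.ones(p), jac=True, method="L-BFGS-B")
x_hat = res.x * v_star(res.x)  # recover x = u * v

Because v is eliminated in closed form, the solver only ever sees the reduced functional, which is the better-conditioning effect the abstract attributes to VarPro; the envelope theorem means no differentiation through v*(u) is needed for the gradient.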
