Implicit differentiation of Lasso-type models for hyperparameter optimization

Paper title

Implicit differentiation of Lasso-type models for hyperparameter optimization

Authors

Quentin Bertrand, Quentin Klopfenstein, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon

Abstract

Setting regularization parameters for Lasso-type estimators is notoriously difficult, though crucial in practice. The most popular hyperparameter optimization approach is grid search using held-out validation data. Grid search, however, requires choosing a predefined grid for each parameter, so its cost scales exponentially in the number of parameters. Another approach is to cast hyperparameter optimization as a bi-level optimization problem, which one can solve by gradient descent. The key challenge for these methods is the estimation of the gradient with respect to the hyperparameters. Computing this gradient via forward or backward automatic differentiation is possible, yet usually suffers from high memory consumption. Alternatively, implicit differentiation typically involves solving a linear system, which can be prohibitive and numerically unstable in high dimension. In addition, implicit differentiation usually assumes smooth loss functions, which is not the case for Lasso-type problems. This work introduces an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems. Our approach scales to high-dimensional data by leveraging the sparsity of the solutions. Experiments demonstrate that the proposed method outperforms a large number of standard methods to optimize the error on held-out data, or the Stein Unbiased Risk Estimator (SURE).
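
To make the bi-level idea in the abstract concrete, here is a minimal sketch of a hypergradient for the plain Lasso. It is not the paper's algorithm: it uses the classical implicit-differentiation formula restricted to the support of the solution, and it still solves a small linear system, which the paper's method is designed to avoid. The objective scaling (1/(2n) ||y - X b||^2 + lam ||b||_1), the squared held-out loss, the synthetic data, and the names lasso_cd and hypergradient are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * |.| (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=500):
    """Cyclic coordinate descent for 1/(2n) ||y - X b||^2 + lam ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                      # residual y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n     # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            rho = X[:, j] @ resid / n + col_sq[j] * old
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            if beta[j] != old:
                resid -= X[:, j] * (beta[j] - old)
    return beta


def hypergradient(X, y, X_val, y_val, lam):
    """Derivative of the held-out loss 1/(2m) ||y_val - X_val b(lam)||^2
    with respect to lam, via implicit differentiation on the support."""
    n = X.shape[0]
    beta = lasso_cd(X, y, lam)
    support = np.flatnonzero(beta)
    if support.size == 0:
        return beta, 0.0
    # Gradient of the held-out loss with respect to beta.
    grad_val = X_val.T @ (X_val @ beta - y_val) / X_val.shape[0]
    # Optimality on the support: X_S^T (X_S b_S - y) / n + lam * sign(b_S) = 0.
    # Differentiating in lam gives d b_S / d lam = -n (X_S^T X_S)^{-1} sign(b_S);
    # coordinates outside the support have zero derivative.
    Xs = X[:, support]
    jac_s = -n * np.linalg.solve(Xs.T @ Xs, np.sign(beta[support]))
    return beta, float(grad_val[support] @ jac_s)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n, m, p = 40, 30, 10
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)
X_val = rng.standard_normal((m, p))
y_val = X_val @ beta_true + 0.1 * rng.standard_normal(m)

lam = 0.1 * np.max(np.abs(X.T @ y)) / n   # a fraction of the smallest lam giving b = 0
beta_hat, grad = hypergradient(X, y, X_val, y_val, lam)
print("support size:", np.count_nonzero(beta_hat), "hypergradient:", grad)
```

The linear system here has the size of the support rather than the number of features, which is one way to see why sparsity of the solution matters for scaling. A gradient-based outer loop would use this derivative to update lam instead of evaluating a fixed grid.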
