Paper Title

Towards Extremely Fast Bilevel Optimization with Self-governed Convergence Guarantees

Authors

Risheng Liu, Xuan Liu, Wei Yao, Shangzhi Zeng, Jin Zhang

Abstract

Gradient methods have become mainstream techniques for Bi-Level Optimization (BLO) in learning and vision fields. The validity of existing works relies heavily on solving a series of approximation subproblems with extraordinarily high accuracy. Unfortunately, achieving this approximation accuracy requires executing a large number of time-consuming iterations, which naturally incurs a heavy computational burden. This paper is thus devoted to addressing this critical computational issue. In particular, we propose a single-level formulation to uniformly understand existing explicit and implicit Gradient-based BLOs (GBLOs). This, together with our designed counter-example, clearly illustrates the fundamental numerical and theoretical issues of GBLOs and their naive accelerations. By introducing the dual multipliers as a new variable, we then establish Bilevel Alternating Gradient with Dual Correction (BAGDC), a general framework that significantly accelerates different categories of existing methods by taking specific settings. A striking feature of our convergence result is that, compared to those original unaccelerated GBLO versions, the fast BAGDC admits a unified non-asymptotic convergence theory towards stationarity. A variety of numerical experiments have also been conducted to demonstrate the superiority of the proposed algorithmic framework.
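
For context, here is a minimal sketch of the standard bilevel program that the abstract refers to; the symbols F, f, x, and y below are generic placeholders rather than notation taken from the paper:

\[
\min_{x} \; F\big(x, y^{*}(x)\big) \quad \text{s.t.} \quad y^{*}(x) \in \operatorname*{arg\,min}_{y} \; f(x, y),
\]

where F and f denote the upper- and lower-level objectives. Gradient-based BLO methods approximate the lower-level solution y*(x) by running a finite number of inner iterations, and the accuracy demanded of this approximation is precisely the source of the computational burden discussed above; the single-level reformulation and the dual-multiplier correction that define BAGDC are specified in the paper itself.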
