Paper Title

Decentralized Stochastic Bilevel Optimization with Improved per-Iteration Complexity

Paper Authors

Xuxing Chen, Minhui Huang, Shiqian Ma, Krishnakumar Balasubramanian

Paper Abstract

Bilevel optimization has recently received tremendous attention due to its great success in solving important machine learning problems such as meta-learning, reinforcement learning, and hyperparameter optimization. Extending single-agent training on bilevel problems to the decentralized setting is a natural generalization, and there has been a flurry of work studying decentralized bilevel optimization algorithms. However, it remained unknown how to design a distributed algorithm whose sample complexity and convergence rate are comparable to those of SGD for stochastic optimization, and that at the same time does not directly compute exact Hessian or Jacobian matrices. In this paper we propose such an algorithm. More specifically, we propose a novel decentralized stochastic bilevel optimization (DSBO) algorithm that only requires first-order stochastic oracles and Hessian-vector and Jacobian-vector product oracles. The sample complexity of our algorithm matches the best currently known results for DSBO, and the advantage of our algorithm is that it does not require estimating the full Hessian and Jacobian matrices, thereby achieving improved per-iteration complexity.
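
The improvement in per-iteration complexity comes from replacing full second-order matrices with matrix-vector products. As a minimal illustrative sketch (not the paper's DSBO algorithm, and assuming PyTorch as the autodiff framework), the following shows how a Hessian-vector product can be obtained via double backpropagation at O(d) memory per product, rather than the O(d^2) cost of materializing the Hessian; a Jacobian-vector product oracle can be realized analogously.

```python
import torch

def hessian_vector_product(g, y, v):
    """Return H(y) @ v for a scalar-valued function g, without forming H."""
    y = y.detach().requires_grad_(True)
    # First backward pass: compute grad g(y), keeping the graph for reuse.
    (grad,) = torch.autograd.grad(g(y), y, create_graph=True)
    # Second backward pass: d/dy <grad g(y), v> equals H(y) v.
    (hvp,) = torch.autograd.grad(grad @ v, y)
    return hvp

# Sanity check on a quadratic g(y) = 0.5 * y^T A y, whose Hessian is A.
A = torch.randn(5, 5)
A = 0.5 * (A + A.T)  # symmetrize so the Hessian is exactly A
g = lambda y: 0.5 * y @ (A @ y)
y0, v = torch.randn(5), torch.randn(5)
print(torch.allclose(hessian_vector_product(g, y0, v), A @ v, atol=1e-5))
```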
