Paper Title
A universal black-box optimization method with almost dimension-free convergence rate guarantees
Paper Authors
Paper Abstract
Universal methods for optimization are designed to achieve theoretically optimal convergence rates without any prior knowledge of the problem's regularity parameters or the accuracy of the gradient oracle employed by the optimizer. In this regard, existing state-of-the-art algorithms achieve an $\mathcal{O}(1/T^2)$ value convergence rate in Lipschitz smooth problems with a perfect gradient oracle, and an $\mathcal{O}(1/\sqrt{T})$ convergence rate when the underlying problem is non-smooth and/or the gradient oracle is stochastic. On the downside, these methods do not take into account the problem's dimensionality, and this can have a catastrophic impact on the achieved convergence rate, in both theory and practice. Our paper aims to bridge this gap by providing a scalable universal gradient method - dubbed UnderGrad - whose oracle complexity is almost dimension-free in problems with a favorable geometry (like the simplex, linearly constrained semidefinite programs and combinatorial bandits), while retaining the order-optimal dependence on $T$ described above. These "best-of-both-worlds" results are achieved via a primal-dual update scheme inspired by the dual exploration method for variational inequalities.
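The abstract gives no pseudocode, so the following is only a hedged illustration of the style of method it describes: a minimal sketch of adaptive entropic dual averaging on the simplex, not the paper's exact UnderGrad update. The entropic mirror map is the standard route to an almost dimension-free (logarithmic-in-$d$) dependence on simplex-like domains, and the AdaGrad-style step size is one standard way to run without prior knowledge of smoothness or oracle noise. The function name `entropic_dual_averaging`, the step-size rule, and the example problem are illustrative assumptions.

```python
import numpy as np

def entropic_dual_averaging(grad_oracle, dim, T):
    """Illustrative sketch (not the paper's exact UnderGrad method):
    dual averaging on the simplex with an entropic mirror map and an
    AdaGrad-style adaptive step size, so no smoothness constant or
    noise level needs to be known in advance."""
    y = np.zeros(dim)      # accumulated negative gradients (dual state)
    sum_sq = 0.0           # running sum of squared gradient sup-norms
    x_avg = np.zeros(dim)  # running average of the iterates
    for t in range(1, T + 1):
        # Adaptive step size: eta_t ~ 1 / sqrt(1 + sum of past ||g||_inf^2)
        eta = 1.0 / np.sqrt(1.0 + sum_sq)
        # Entropic mirror step: softmax maps the dual state back to the simplex
        z = eta * y
        z -= z.max()                        # numerical stability
        x = np.exp(z) / np.exp(z).sum()
        g = grad_oracle(x)                  # (possibly stochastic) gradient
        y -= g                              # dual accumulation
        sum_sq += np.linalg.norm(g, np.inf) ** 2
        x_avg += (x - x_avg) / t            # ergodic average
    return x_avg

if __name__ == "__main__":
    # Hypothetical example: minimize ||x - e_1||^2 over the 1000-dimensional simplex
    d = 1000
    e1 = np.zeros(d); e1[0] = 1.0
    x_hat = entropic_dual_averaging(lambda x: 2.0 * (x - e1), dim=d, T=2000)
    print("mass on first coordinate:", x_hat[0])
```

In this sketch the only dimension dependence enters through the entropic regularizer, whose range on the simplex grows like $\log d$; a Euclidean projected-gradient variant would instead pick up a polynomial factor in $d$, which is the gap the abstract refers to as "almost dimension-free".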