Paper Title
Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization
Paper Authors
Paper Abstract
In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization, where only function values can be obtained. Moreover, we prove that our Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d^{3/4}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$, where $d$ denotes the variable dimension. In particular, our Acc-ZOM does not require the large batches needed by existing zeroth-order stochastic algorithms. Meanwhile, we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax optimization, where only function values can be obtained. Our Acc-ZOMDA obtains a low query complexity of $\tilde{O}((d_1+d_2)^{3/4}\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point, where $d_1$ and $d_2$ denote the variable dimensions and $\kappa_y$ denotes the condition number. Moreover, we propose an accelerated first-order momentum descent ascent (Acc-MDA) method for minimax optimization where explicit gradients are accessible. Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point. In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(\kappa_y^{2.5}\epsilon^{-3})$ with a batch size of $O(\kappa_y^4)$, which improves the best known result by a factor of $O(\kappa_y^{1/2})$. Extensive experimental results on black-box adversarial attacks on deep neural networks and poisoning attacks on logistic regression demonstrate the efficiency of our algorithms.
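The abstract does not spell out the Acc-ZOM update itself, so the following is a minimal sketch of the generic technique it builds on: estimating gradients from function values alone via a two-point (Gaussian-smoothing) finite difference and feeding the estimates into a momentum update. Here an $\epsilon$-stationary point simply means a point $x$ with $\|\nabla f(x)\| \le \epsilon$. The function names, the smoothing radius `mu`, the step size `eta`, and the plain exponential-moving-average momentum below are illustrative assumptions, not the paper's actual estimator or parameter schedule.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate via Gaussian smoothing.

    Uses only function values: g = (f(x + mu*u) - f(x)) / mu * u
    with a random direction u ~ N(0, I); in expectation g approximates
    the gradient of a smoothed version of f.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

def zo_momentum_descent(f, x0, eta=0.01, beta=0.9, iters=2000):
    """Gradient-free descent driven by a momentum average of ZO estimates."""
    x = x0.astype(float)
    v = np.zeros_like(x)
    for _ in range(iters):
        g = zo_gradient(f, x)
        v = beta * v + (1 - beta) * g  # exponential moving average (momentum)
        x = x - eta * v                # descent step along the averaged direction
    return x

# Usage: minimize a simple quadratic using only function evaluations.
f = lambda x: 0.5 * np.sum(x ** 2)
x_final = zo_momentum_descent(f, x0=np.ones(10))
print(np.linalg.norm(x_final))  # small norm => near-stationary point
```

In the black-box settings the paper experiments with (e.g., adversarial attacks on deep networks), `f` would be a loss that can only be queried, not differentiated, which is why the per-iteration cost is measured in function queries rather than gradient evaluations.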