Paper title
A consensus-based global optimization method with adaptive momentum estimation
Paper authors
Paper abstract
Objective functions in large-scale machine learning and artificial intelligence applications are often high-dimensional, strongly non-convex, and riddled with a massive number of local minima. First-order methods, such as the stochastic gradient method and Adam, are often used to find global minima. Recently, the consensus-based optimization (CBO) method has been introduced as a gradient-free optimization method, and its convergence has been proven under dimension-dependent parameters, which may suffer from the curse of dimensionality. By replacing the isotropic geometric Brownian motion with a component-wise one, the latest improvement of the CBO method is guaranteed to converge to the global minimizer with dimension-independent parameters, although the initial data need to be well-chosen. In this paper, based on the CBO method and Adam, we propose a consensus-based global optimization method with adaptive momentum estimation (Adam-CBO). Advantages of the Adam-CBO method include: (1) it is capable of finding global minima of non-convex objective functions with high success rates and low costs; (2) it can handle non-differentiable activation functions and thus approximates low-regularity functions with better accuracy. The former is verified by approximating the $1000$-dimensional Rastrigin function with a $100\%$ success rate at a cost that grows only linearly with the dimensionality. The latter is confirmed by solving a machine learning task for partial differential equations with low-regularity solutions, where the Adam-CBO method provides better results than the state-of-the-art method Adam. A linear stability analysis is provided to understand the asymptotic behavior of the Adam-CBO method.
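To make the baseline concrete, the following is a minimal NumPy sketch of the component-wise (anisotropic) CBO iteration from the literature that Adam-CBO builds on, applied to the Rastrigin benchmark mentioned above. The hyperparameters (`lam`, `sigma`, `beta`, `dt`, particle count) and function names are illustrative assumptions, not the paper's settings, and the Adam-style first- and second-moment estimates that distinguish Adam-CBO are deliberately omitted.

```python
import numpy as np

def rastrigin(x):
    """Rastrigin function evaluated row-wise on an (N, d) array of particles."""
    return 10.0 * x.shape[1] + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x), axis=1)

def cbo_componentwise(f, x, n_steps=2000, dt=0.01, lam=1.0, sigma=1.0, beta=50.0):
    """Consensus-based optimization with component-wise geometric Brownian motion.

    x: (N, d) array of particle positions.
    Each step: particles drift toward a weighted consensus point, and each
    coordinate receives noise scaled by its own distance to the consensus,
    which is what makes the scheme dimension-friendly.
    """
    for _ in range(n_steps):
        fx = f(x)
        # Gibbs-type weights; subtract the minimum for numerical stability.
        w = np.exp(-beta * (fx - fx.min()))
        consensus = (w[:, None] * x).sum(axis=0) / w.sum()
        diff = x - consensus
        noise = np.sqrt(dt) * np.random.randn(*x.shape)
        # Euler-Maruyama step: drift toward consensus plus anisotropic noise.
        x = x - lam * dt * diff + sigma * diff * noise
    return consensus

if __name__ == "__main__":
    np.random.seed(0)
    d, N = 10, 200  # modest dimension for a quick demonstration
    x0 = np.random.uniform(-3.0, 3.0, size=(N, d))
    x_star = cbo_componentwise(rastrigin, x0)
    print("consensus point:", np.round(x_star, 3))  # global minimizer is the origin
```

Note that no gradient of `rastrigin` is ever evaluated, only function values, which is the gradient-free property the abstract emphasizes; Adam-CBO augments this update with adaptive momentum estimates in the spirit of Adam.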