Title
A New Accelerated Stochastic Gradient Method with Momentum
Authors
Abstract
In this paper, we propose a novel accelerated stochastic gradient method with momentum, where the momentum is a weighted average of previous gradients and the weights decay inverse-proportionally with the iteration count. Stochastic gradient descent with momentum (SGDM) uses weights that decay exponentially with the iteration count to generate the momentum term. Building on exponentially decaying weights, variants of SGDM with carefully designed but complicated forms have been proposed to achieve better performance. The momentum update rule of our method is as simple as that of SGDM. We provide a theoretical convergence analysis for our method, which shows that both exponentially decaying weights and our inverse-proportionally decaying weights confine the variance of the parameters' update direction to a bounded region. Experimental results empirically show that our method works well on practical problems and outperforms SGDM, and that it outperforms Adam on convolutional neural networks.
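The abstract contrasts two ways of weighting past gradients in the momentum term. Below is a minimal sketch (not the authors' reference code) comparing classical SGDM's exponentially decaying weights with one plausible reading of inverse-proportionally decaying weights, run on a toy least-squares problem; the schedule w_k ∝ 1/(k+1), the finite history window, and all hyperparameters are illustrative assumptions, not values taken from the paper.

```python
# Sketch: exponentially decaying vs. inverse-proportionally decaying
# momentum weights on a toy stochastic least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))
b = rng.normal(size=100)

def grad(x):
    """Stochastic gradient of 0.5*||Ax - b||^2 from one random row."""
    i = rng.integers(len(b))
    return A[i] * (A[i] @ x - b[i])

def sgdm(steps=2000, lr=0.01, beta=0.9):
    # Classical SGDM: the gradient from k steps ago is weighted by
    # beta**k, i.e. exponentially decaying in the iteration gap.
    x = np.zeros(10)
    m = np.zeros(10)
    for _ in range(steps):
        m = beta * m + (1 - beta) * grad(x)
        x -= lr * m
    return x

def inverse_decay(steps=2000, lr=0.01, window=50):
    # Hypothetical inverse-proportional weighting: the gradient from
    # k steps ago gets weight 1/(k+1), normalized over a finite window
    # so each step costs O(window).
    x = np.zeros(10)
    history = []                         # newest gradient first
    w = 1.0 / np.arange(1, window + 1)   # weights 1, 1/2, 1/3, ...
    for _ in range(steps):
        history.insert(0, grad(x))
        history = history[:window]
        ws = w[: len(history)]
        m = sum(wi * gi for wi, gi in zip(ws, history)) / ws.sum()
        x -= lr * m
    return x

x_star = np.linalg.lstsq(A, b, rcond=None)[0]
for name, fn in [("SGDM", sgdm), ("inverse-decay", inverse_decay)]:
    print(name, "distance to optimum:", np.linalg.norm(fn() - x_star))
```

The windowed history above only keeps the per-step cost bounded in this illustration; the abstract states that the proposed update rule is as simple as SGDM's single recursion, so this sketch should be read as a demonstration of the two weighting schemes, not as the paper's algorithm.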