Paper Title

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima

Paper Authors

Taehee Ko, Xiantao Li

Paper Abstract

Loss functions with non-isolated minima have emerged in several machine learning problems, creating a gap between theory and practice. In this paper, we formulate a new type of local convexity condition that is suitable for describing the behavior of loss functions near non-isolated minima. We show that such a condition is general enough to encompass many existing conditions. In addition, we study the local convergence of SGD under this mild condition by adopting the notion of stochastic stability. The corresponding concentration inequalities from the convergence analysis help interpret empirical observations from practical training results.
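
As a concrete illustration of non-isolated minima (a minimal sketch, not the paper's construction): consider the toy loss f(w) = (||w||^2 - 1)^2 / 4 on R^2, whose global minimizers form the entire unit circle rather than isolated points. The snippet below runs plain SGD with artificial Gaussian gradient noise on this loss; the function, step size, noise level, and iteration count are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch: SGD on a toy loss whose minimizers are non-isolated.
# Every point on the unit circle minimizes f(w) = (||w||^2 - 1)^2 / 4.

def loss(w):
    return 0.25 * (w @ w - 1.0) ** 2

def grad(w):
    # Exact gradient of the toy loss: (||w||^2 - 1) * w
    return (w @ w - 1.0) * w

rng = np.random.default_rng(0)
w = np.array([2.0, 1.5])   # initialize away from the minimizer set
eta, sigma = 0.05, 0.1     # illustrative step size and noise scale

for _ in range(2000):
    g = grad(w) + sigma * rng.standard_normal(2)  # stochastic gradient
    w = w - eta * g

# The iterate settles near the circle ||w|| = 1, but which point it
# approaches depends on the initialization and the noise realization.
print(f"||w|| = {np.linalg.norm(w):.3f}, loss = {loss(w):.2e}")
```

With a constant step size, the iterate hovers in a noise-dependent neighborhood of the minimizer circle rather than at a single point; this is the kind of behavior near a set of non-isolated minima that a local convergence analysis of SGD has to capture.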
