Paper Title

Finite-Sum Optimization: A New Perspective for Convergence to a Global Solution

Paper Authors

Nguyen, Lam M., Tran, Trang H., van Dijk, Marten

Paper Abstract

Deep neural networks (DNNs) have shown great success in many machine learning tasks. Their training is challenging since the loss surface of the network architecture is generally non-convex, or even non-smooth. How and under what assumptions is guaranteed convergence to a \textit{global} minimum possible? We propose a reformulation of the minimization problem allowing for a new recursive algorithmic framework. By using bounded style assumptions, we prove convergence to an $\varepsilon$-(global) minimum using $\mathcal{\tilde{O}}(1/\varepsilon^3)$ gradient computations. Our theoretical foundation motivates further study, implementation, and optimization of the new algorithmic framework and further investigation of its non-standard bounded style assumptions. This new direction broadens our understanding of why and under what circumstances training of a DNN converges to a global minimum.
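
For context, the title's finite-sum setting and the abstract's $\varepsilon$-(global) minimum guarantee refer to the following standard formulation; this is the conventional setup from the optimization literature, spelled out here for reference since the abstract itself does not state it:

$$
\min_{w \in \mathbb{R}^d} \; F(w) := \frac{1}{n} \sum_{i=1}^{n} f_i(w), \qquad \text{with } \hat{w} \text{ an } \varepsilon\text{-(global) minimum if } F(\hat{w}) - \inf_{w} F(w) \le \varepsilon .
$$

Here each $f_i$ is the loss on the $i$-th training sample (e.g., of a DNN), and $\mathcal{\tilde{O}}(\cdot)$ conventionally suppresses logarithmic factors in $1/\varepsilon$.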
