Paper Title

On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime

Author

Wojtowytsch, Stephan

Abstract

We describe a necessary and sufficient condition for the convergence to minimum Bayes risk when training two-layer ReLU-networks by gradient descent in the mean field regime with omni-directional initial parameter distribution. This article extends recent results of Chizat and Bach to ReLU-activated networks and to the situation in which there are no parameters which exactly achieve MBR. The condition does not depend on the initialization of parameters and concerns only the weak convergence of the realization of the neural network, not its parameter distribution.
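To make the setting of the abstract concrete: in the mean field regime, the output of a width-m two-layer network is the average f(x) = (1/m) Σᵢ aᵢ ReLU(wᵢ·x) over its neurons, and gradient descent is run with the step size rescaled by m so that each neuron ("particle") moves at order one. The sketch below is purely illustrative, not code from the paper; the target function, hyperparameters, and initialization are assumptions chosen for a minimal runnable example.

```python
import numpy as np

# Illustrative sketch of the setting in the abstract (not code from the paper):
# a two-layer ReLU network in the mean-field parameterization
#     f(x) = (1/m) * sum_i a_i * relu(w_i . x),
# trained by full-batch gradient descent on the squared loss.
rng = np.random.default_rng(0)

m, d, n = 512, 2, 64              # neurons, input dimension, samples
X = rng.normal(size=(n, d))
y = np.maximum(X[:, 0], 0.0)      # toy target: itself a single ReLU neuron

W = rng.normal(size=(m, d))       # inner weights w_i
a = rng.choice([-1.0, 1.0], m)    # outer weights a_i

lr = 0.1
for _ in range(3000):
    pre = X @ W.T                            # (n, m) pre-activations
    act = np.maximum(pre, 0.0)               # ReLU
    err = act @ a / m - y                    # residual of the 1/m-scaled output
    # Gradients of 0.5 * mean(err**2); the extra factor m in the update below
    # is the mean-field time scaling, so each particle (a_i, w_i) moves O(1).
    grad_a = act.T @ err / (n * m)
    grad_W = ((err[:, None] * (pre > 0) * a).T @ X) / (n * m)
    a -= lr * m * grad_a
    W -= lr * m * grad_W

mse = float(np.mean((np.maximum(X @ W.T, 0.0) @ a / m - y) ** 2))
```

In this parameterization the training dynamics are, as m grows, well described by a gradient flow on the distribution of particles (aᵢ, wᵢ); the convergence results discussed in the abstract concern the limit of this flow.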
