Paper Title

On the symmetries in the dynamics of wide two-layer neural networks

Paper Authors

Karl Hajjar, Lénaïc Chizat

Paper Abstract

We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of symmetries which, when satisfied by the target function $f^*$ and the input distribution, are preserved by the dynamics. We then study more specific cases. When $f^*$ is odd, we show that the dynamics of the predictor reduces to that of a (non-linearly parameterized) linear predictor, and its exponential convergence can be guaranteed. When $f^*$ has a low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numerical arguments that suggest that the input neurons align with the lower-dimensional structure of the problem.
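For context, the "gradient flow on the population risk" in the infinite-width limit is typically written as a gradient flow over a probability measure on the neuron weights. The display below is a minimal sketch of that standard mean-field formulation (in the style of Chizat & Bach, 2018) for a two-layer ReLU network without bias; the symbols $\mu_t$, $\rho$, and $V_\mu$ are notation introduced here, and the exact normalization and parameterization may differ from the paper.

$$ f_\mu(x) = \int c\, \sigma(a^\top x)\, \mathrm{d}\mu(a, c), \qquad R(\mu) = \tfrac{1}{2}\, \mathbb{E}_{x \sim \rho}\big[(f_\mu(x) - f^*(x))^2\big], $$

where $\sigma(u) = \max(u, 0)$ is the ReLU and $\rho$ denotes the input distribution. The "gradient flow PDE" referred to in the abstract is then the Wasserstein gradient flow of $R$,

$$ \partial_t \mu_t = \operatorname{div}\!\big(\mu_t\, \nabla_w V_{\mu_t}\big), \qquad V_\mu(a, c) = \mathbb{E}_{x \sim \rho}\big[(f_\mu(x) - f^*(x))\, c\, \sigma(a^\top x)\big], $$

and the results above describe how this PDE simplifies when $f^*$ and $\rho$ satisfy the corresponding symmetries (e.g., $f^*$ odd, or $f^*$ depending only on a low-dimensional projection of the input).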
