Paper Title

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

Paper Authors

Itay Safran, Gal Vardi, Jason D. Lee

Paper Abstract

We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. We show that when the labels are determined by the sign of a target network with $r$ neurons, with high probability over the initialization of the network and the sampling of the dataset, GF converges in direction (suitably defined) to a network achieving perfect training accuracy and having at most $\mathcal{O}(r)$ linear regions, implying a generalization bound. Unlike many other results in the literature, under an additional assumption on the distribution of the data, our result holds even for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
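To make the central quantity concrete: a width-$r$ shallow univariate ReLU network computes a piecewise-linear function, and each hidden neuron contributes at most one breakpoint, so the network has at most $r + 1$ linear regions. The NumPy sketch below (not from the paper; the helper name count_linear_regions and the random parameters are purely illustrative) counts the regions of such a network by accumulating the slope jump each neuron contributes at its breakpoint.

```python
import numpy as np

# A width-r shallow univariate ReLU network computes
#   f(x) = sum_j v_j * relu(w_j * x + b_j),
# a piecewise-linear function. Neuron j contributes at most one
# breakpoint at x_j = -b_j / w_j, and the slope jump of f across
# that breakpoint is v_j * |w_j| (for either sign of w_j).

def count_linear_regions(w, b, v, tol=1e-12):
    """Count the linear regions of f by summing, per breakpoint,
    the net slope jump contributed by the neurons kinking there."""
    jumps = {}
    for wj, bj, vj in zip(w, b, v):
        if abs(wj) < tol:
            continue  # constant-input neuron: no breakpoint
        x0 = round(-bj / wj, 12)  # group (numerically) coincident kinks
        jumps[x0] = jumps.get(x0, 0.0) + vj * abs(wj)
    # a breakpoint separates two regions only if its net slope jump is nonzero
    return 1 + sum(abs(j) > tol for j in jumps.values())

rng = np.random.default_rng(0)
r = 5  # width of a hypothetical network
w, b, v = rng.normal(size=r), rng.normal(size=r), rng.normal(size=r)
print(count_linear_regions(w, b, v))  # at most r + 1
```

Note that breakpoints whose net slope jump cancels do not separate regions, which is why the effective number of linear regions can be far smaller than the width of the network.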
