Paper Title
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Paper Authors
Paper Abstract
We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. We show that when the labels are determined by the sign of a target network with $r$ neurons, with high probability over the initialization of the network and the sampling of the dataset, GF converges in direction (suitably defined) to a network achieving perfect training accuracy and having at most $\mathcal{O}(r)$ linear regions, implying a generalization bound. Unlike many other results in the literature, under an additional assumption on the distribution of the data, our result holds even for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
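The central quantity in the abstract is the number of linear regions a shallow univariate ReLU network actually realizes, which can be far smaller than the naive width-plus-one upper bound when neurons share breakpoints or their slope contributions cancel. The sketch below is a minimal illustration of one natural way to count such "effective" regions; it is not code from the paper, and the parameterization and names (`relu_net`, `count_linear_regions`, `w`, `b`, `v`) are illustrative assumptions.

```python
import numpy as np

def relu_net(x, w, b, v):
    """Shallow univariate ReLU network: f(x) = sum_i v_i * relu(w_i * x + b_i).

    Hypothetical parameterization used for this sketch: w, b are the hidden
    layer's weights and biases, v the output weights.
    """
    return np.maximum(w * x + b, 0.0) @ v

def count_linear_regions(w, b, v, tol=1e-8):
    """Count the effective linear regions of f by checking where its slope changes.

    Neuron i has a breakpoint at x = -b_i / w_i (where its pre-activation
    crosses zero). Crossing that breakpoint from left to right changes the
    slope of f by v_i * |w_i|. A breakpoint is 'effective' only if the total
    slope change there is nonzero, so coincident neurons can cancel and the
    effective region count can drop well below width + 1.
    """
    w, b, v = map(np.asarray, (w, b, v))
    active = np.abs(w) > tol                 # neurons with w_i = 0 have no breakpoint
    breaks = -b[active] / w[active]
    jumps = v[active] * np.abs(w[active])    # slope change at each breakpoint
    order = np.argsort(breaks)
    breaks, jumps = breaks[order], jumps[order]
    regions = 1
    i = 0
    while i < len(breaks):
        # Aggregate slope changes over (numerically) coincident breakpoints.
        j, total = i, 0.0
        while j < len(breaks) and breaks[j] - breaks[i] <= tol:
            total += jumps[j]
            j += 1
        if abs(total) > tol:                 # slope actually changes here
            regions += 1
        i = j
    return regions

# Example: width 3, but two neurons share a breakpoint and cancel exactly,
# so f(x) = 0.5 * relu(x - 1) has only 2 effective regions, not 4.
w = np.array([1.0, 1.0, 1.0])
b = np.array([0.0, 0.0, -1.0])
v = np.array([1.0, -1.0, 0.5])
print(count_linear_regions(w, b, v))  # -> 2
```

Under this counting convention, the paper's result can be read as saying that the GF limit behaves like the small example above: regardless of the network's width, at most $\mathcal{O}(r)$ breakpoints carry a nonzero net slope change.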