Paper Title
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Paper Authors
Paper Abstract
We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. We show that when the labels are determined by the sign of a target network with $r$ neurons, with high probability over the initialization of the network and the sampling of the dataset, GF converges in direction (suitably defined) to a network achieving perfect training accuracy and having at most $\mathcal{O}(r)$ linear regions, implying a generalization bound. Unlike many other results in the literature, under an additional assumption on the distribution of the data, our result holds even for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
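The central quantity in the abstract is the number of linear regions a shallow univariate ReLU network actually realizes, which can be far smaller than the naive width-plus-one upper bound when neurons share breakpoints or their slope contributions cancel. The sketch below is a minimal illustration of one natural way to count such "effective" regions; it is not code from the paper, and the parameterization and names (`relu_net`, `count_linear_regions`, `w`, `b`, `v`) are illustrative assumptions.

```python
import numpy as np

def relu_net(x, w, b, v):
    """Shallow univariate ReLU network: f(x) = sum_i v_i * relu(w_i * x + b_i).

    Hypothetical parameterization used for this sketch: w, b are the hidden
    layer's weights and biases, v the output weights.
    """
    return np.maximum(w * x + b, 0.0) @ v

def count_linear_regions(w, b, v, tol=1e-8):
    """Count the effective linear regions of f by checking where its slope changes.

    Neuron i has a breakpoint at x = -b_i / w_i (where its pre-activation
    crosses zero). Crossing that breakpoint from left to right changes the
    slope of f by v_i * |w_i|. A breakpoint is 'effective' only if the total
    slope change there is nonzero, so coincident neurons can cancel and the
    effective region count can drop well below width + 1.
    """
    w, b, v = map(np.asarray, (w, b, v))
    active = np.abs(w) > tol                 # neurons with w_i = 0 have no breakpoint
    breaks = -b[active] / w[active]
    jumps = v[active] * np.abs(w[active])    # slope change at each breakpoint
    order = np.argsort(breaks)
    breaks, jumps = breaks[order], jumps[order]
    regions = 1
    i = 0
    while i < len(breaks):
        # Aggregate slope changes over (numerically) coincident breakpoints.
        j, total = i, 0.0
        while j < len(breaks) and breaks[j] - breaks[i] <= tol:
            total += jumps[j]
            j += 1
        if abs(total) > tol:                 # slope actually changes here
            regions += 1
        i = j
    return regions

# Example: width 3, but two neurons share a breakpoint and cancel exactly,
# so f(x) = 0.5 * relu(x - 1) has only 2 effective regions, not 4.
w = np.array([1.0, 1.0, 1.0])
b = np.array([0.0, 0.0, -1.0])
v = np.array([1.0, -1.0, 0.5])
print(count_linear_regions(w, b, v))  # -> 2
```

Under this counting convention, the paper's result can be read as saying that the GF limit behaves like the small example above: regardless of the network's width, at most $\mathcal{O}(r)$ breakpoints carry a nonzero net slope change.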