Paper Title
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Paper Authors
Paper Abstract
Among attempts at giving a theoretical account of the success of deep neural networks, a recent line of work has identified a so-called lazy training regime, in which the network can be well approximated by its linearization around initialization. Here we investigate the comparative effect of the lazy (linear) and feature learning (non-linear) regimes on subgroups of examples based on their difficulty. Specifically, we show that easier examples are given more weight in the feature learning regime, resulting in faster training compared to more difficult ones. In other words, the non-linear dynamics tends to sequentialize the learning of examples of increasing difficulty. We illustrate this phenomenon across different ways to quantify example difficulty, including the C-score, label noise, and the presence of easy-to-learn spurious correlations. Our results reveal a new understanding of how deep networks prioritize resources across example difficulty.
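As a rough illustration of the linearization that defines the lazy regime, the sketch below (not from the paper; the two-layer model and all names are hypothetical) builds a network's first-order Taylor expansion around its initialization using JAX. Training `f_lin` corresponds to the lazy (linear) dynamics, while training `f` directly corresponds to the feature learning (non-linear) dynamics.

```python
# Minimal sketch (not from the paper): a network vs. its linearization
# around initialization, i.e. the "lazy" approximation
#   f_lin(p, x) = f(p0, x) + J_f(p0, x) @ (p - p0).
# The two-layer MLP and all names here are hypothetical.
import jax
import jax.numpy as jnp

def init_params(key, d_in=10, d_hidden=64):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (d_in, d_hidden)) / jnp.sqrt(d_in),
        "w2": jax.random.normal(k2, (d_hidden, 1)) / jnp.sqrt(d_hidden),
    }

def f(params, x):
    # Plain two-layer network: the feature learning (non-linear) model.
    return jnp.tanh(x @ params["w1"]) @ params["w2"]

def make_f_lin(params0):
    # First-order Taylor expansion of f around params0, computed with a
    # Jacobian-vector product rather than materializing the Jacobian.
    def f_lin(params, x):
        dparams = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        out0, jvp_out = jax.jvp(lambda p: f(p, x), (params0,), (dparams,))
        return out0 + jvp_out
    return f_lin

key = jax.random.PRNGKey(0)
params0 = init_params(key)
f_lin = make_f_lin(params0)

x = jax.random.normal(jax.random.PRNGKey(1), (5, 10))
# At initialization the two models coincide exactly.
print(jnp.allclose(f(params0, x), f_lin(params0, x)))  # True
```

At initialization the two models agree exactly; they diverge as training moves the parameters away from `params0`, which is where the lazy and feature learning regimes can weight easy and hard examples differently.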