Paper Title
The learning phases in NN: From Fitting the Majority to Fitting a Few
Paper Authors
Paper Abstract
The learning dynamics of deep neural networks are subject to controversy. Using information bottleneck (IB) theory, separate fitting and compression phases have been proposed but have since been heavily debated. We approach learning dynamics by analyzing a layer's ability to reconstruct the input and its prediction performance, based on the evolution of parameters during training. We show that, under mild assumptions on the data, training exhibits a prototyping phase that initially decreases reconstruction loss, followed by a phase that reduces the classification loss of a few samples and thereby increases reconstruction loss. Aside from providing a mathematical analysis of single-layer classification networks, we also assess this behavior empirically on common computer vision datasets and architectures such as ResNet and VGG.
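To make the abstract's measurement concrete, below is a minimal sketch (not the authors' exact protocol) of one way to track a layer's reconstruction ability alongside its classification loss during training: after each classification epoch, a linear decoder is refit from the layer's activations back to the input, and its mean-squared error serves as the reconstruction loss. The toy model, decoder, and synthetic data are illustrative assumptions.

```python
# A minimal sketch (PyTorch; the model, decoder, and data are placeholder
# choices, not the paper's setup): track a hidden layer's input-reconstruction
# loss alongside classification loss over training epochs.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 20)                 # toy inputs
y = (X[:, 0] > 0).long()                 # toy binary labels

hidden = nn.Linear(20, 8)                # layer under analysis
head = nn.Linear(8, 2)                   # classification head
decoder = nn.Linear(8, 20)               # probes reconstruction of the input

opt = torch.optim.SGD(list(hidden.parameters()) + list(head.parameters()), lr=0.1)
dec_opt = torch.optim.SGD(decoder.parameters(), lr=0.1)

for epoch in range(50):
    # One classification step on the full toy batch.
    h = torch.relu(hidden(X))
    cls_loss = nn.functional.cross_entropy(head(h), y)
    opt.zero_grad(); cls_loss.backward(); opt.step()

    # Refit the decoder on the frozen activations, then read off the
    # reconstruction loss of the layer's current representation.
    for _ in range(20):
        act = torch.relu(hidden(X)).detach()
        rec_loss = nn.functional.mse_loss(decoder(act), X)
        dec_opt.zero_grad(); rec_loss.backward(); dec_opt.step()

    if epoch % 10 == 0:
        print(f"epoch {epoch:3d}  cls_loss {cls_loss.item():.3f}  rec_loss {rec_loss.item():.3f}")
```

If the dynamics described in the abstract hold, rec_loss should first fall and later rise again while cls_loss keeps dropping; on this synthetic data the effect is illustrative only and not guaranteed to appear.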