Paper Title

On the distance between two neural networks and the stability of learning

Paper Authors

Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu

Paper Abstract

This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions. The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks. Since the resulting learning rule seems to require little to no learning rate tuning, it may unlock a simpler workflow for training deeper and more complex neural networks. The Python code used in this paper is here: https://github.com/jxbz/fromage.
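
The abstract names the resulting learning rule (Fromage) without spelling it out. As a concrete illustration, below is a minimal PyTorch sketch of a Fromage-style update in the spirit of the linked repository (https://github.com/jxbz/fromage): each tensor is stepped by an amount proportional to its own weight norm, so the per-layer relative change stays controlled, which is the quantity the paper's deep relative trust distance compounds across layers. The function name fromage_step and the exact scaling factors follow my reading of the paper and are assumptions of this sketch, not the authors' implementation; consult the repo for the real optimizer.

```python
import math

import torch


def fromage_step(params, lr=0.01):
    """One hypothetical Fromage-style step (a sketch, not the authors' code).

    Each tensor moves against its gradient with the step rescaled by
    ||w|| / ||grad||, so every layer changes by roughly the same relative
    amount. The 1/sqrt(1 + lr^2) factor (an assumption taken from the
    paper's update rule) keeps the weight norm from drifting upward.
    """
    with torch.no_grad():
        for w in params:
            if w.grad is None:
                continue
            w_norm, g_norm = w.norm(), w.grad.norm()
            if w_norm > 0 and g_norm > 0:
                # Relative step: scale the gradient to the weight's own size.
                w.add_(w.grad, alpha=-lr * (w_norm / g_norm).item())
            else:
                # Fall back to plain gradient descent for zero-norm tensors.
                w.add_(w.grad, alpha=-lr)
            w.mul_(1.0 / math.sqrt(1.0 + lr ** 2))


# Tiny usage example: one step on a random regression batch.
model = torch.nn.Linear(10, 2)
loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward()
fromage_step(model.parameters(), lr=0.01)
```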
