Paper Title

Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?

Paper Authors

Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

Paper Abstract

Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to $0$. The architecture also enables perfect signal propagation and stable gradients, and achieves high accuracy on standard benchmarks. This indicates that random, diverse initializations are \textit{not} necessary for training neural networks. An essential element in training this network is a mechanism of symmetry breaking; we study this phenomenon and find that standard GPU operations, which are non-deterministic, can serve as a sufficient source of symmetry breaking to enable training.
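To make the core idea concrete, below is a minimal sketch (in PyTorch) of one way a convolutional layer can be initialized so that almost all weights are zero while the layer still propagates its input unchanged. The helper name `delta_init_` and this exact "delta kernel" scheme are illustrative assumptions for exposition, not the authors' released code.

```python
import torch
import torch.nn as nn

def delta_init_(conv: nn.Conv2d) -> None:
    """Identity ("delta") initialization: set every weight to 0 except a
    single 1 at the kernel center of each channel's own filter, so the
    layer maps its input to itself. Signal propagation is exact, yet all
    channels start with identical, zero-diversity features.
    NOTE: an illustrative sketch, not the paper's exact initialization."""
    assert conv.in_channels == conv.out_channels, "square layer assumed"
    kh, kw = conv.kernel_size[0] // 2, conv.kernel_size[1] // 2
    with torch.no_grad():
        conv.weight.zero_()                  # almost all weights are 0
        for c in range(conv.out_channels):
            conv.weight[c, c, kh, kw] = 1.0  # one nonzero entry per channel
        if conv.bias is not None:
            conv.bias.zero_()

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
delta_init_(conv)
x = torch.randn(1, 64, 8, 8)
assert torch.allclose(conv(x), x, atol=1e-6)  # the layer starts as the identity
```

Because every channel computes the same function at initialization, the gradients of identical channels are identical too, and deterministic gradient descent alone cannot differentiate them; as the abstract notes, the non-determinism of standard GPU operations can supply enough symmetry breaking to enable training.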
