Paper Title
Sensitivity - Local Index to Control Chaoticity or Gradient Globally -
Paper Authors
Paper Abstract
Here, we introduce a fully local index named "sensitivity" for each neuron to control chaoticity or gradient globally in a neural network (NN), and propose a learning method to adjust it, named "sensitivity adjustment learning (SAL)". The index is the gradient magnitude of a neuron's output with respect to its inputs. By adjusting its time average to 1.0 in each neuron, information transmission through the neuron becomes moderate, neither shrinking nor expanding the signal, in both forward and backward computations. This yields moderate information transmission through a layer of neurons when the weights and inputs are random. Therefore, SAL can control the chaoticity of the network dynamics in a recurrent NN (RNN). It can also solve the vanishing gradient problem in error backpropagation (BP) learning in a deep feedforward NN or an RNN. We demonstrate that when SAL is applied to an RNN with small and random initial weights, the log-sensitivity, i.e., the logarithm of the RMS (root mean square) sensitivity over all neurons, is equivalent to the maximum Lyapunov exponent until it reaches 0.0. We also show that SAL works with BP or BPTT (BP through time) to avoid the vanishing gradient problem in a 300-layer NN or in an RNN learning a problem with a lag of 300 steps between the first input and the output. Compared with manually fine-tuning the spectral radius of the weight matrix before learning, SAL's continuous nonlinear learning prevents the loss of sensitivity during learning, resulting in a significant improvement in learning performance.
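To make the index concrete, the following is a minimal sketch of the sensitivity of a single tanh neuron and of driving its time-averaged (RMS) value toward 1.0. The multiplicative weight-rescaling rule, the step size `eta`, and the exponential moving average are illustrative assumptions, not the SAL update from the paper; the sketch only shows that pushing the RMS sensitivity to 1.0 is a simple local adjustment when inputs are random.

```python
import math
import random

random.seed(0)

def sensitivity(w, x):
    # For y = tanh(sum_i w_i * x_i), dy/dx_i = (1 - y^2) * w_i,
    # so the gradient magnitude is (1 - y^2) * ||w||.
    u = sum(wi * xi for wi, xi in zip(w, x))
    y = math.tanh(u)
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (1.0 - y * y) * norm

n = 20
w = [0.1 * random.gauss(0, 1) for _ in range(n)]  # small random initial weights
eta = 0.01        # illustrative step size (assumed, not from the paper)
avg_sq = 0.0      # running mean of the squared sensitivity
for _ in range(2000):
    x = [random.gauss(0, 1) for _ in range(n)]    # random inputs, as in the abstract
    s = sensitivity(w, x)
    avg_sq = 0.99 * avg_sq + 0.01 * s * s
    # Simplified stand-in for SAL: rescale the weights so the
    # time-averaged RMS sensitivity is driven toward 1.0.
    scale = 1.0 + eta * (1.0 - s * s)
    w = [wi * scale for wi in w]

rms = math.sqrt(avg_sq)   # RMS sensitivity, settles near 1.0
```

Starting from small weights, the sensitivity is well below 1.0, so the rule grows the weight norm until the RMS sensitivity hovers around 1.0; the same local adjustment applied to every neuron is what lets the index act globally on chaoticity or gradient magnitude.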