Paper Title

Persistent Neurons

Paper Authors

Min, Yimeng

Paper Abstract

Neural network (NN)-based learning algorithms are strongly affected by the choices of initialization and data distribution. Different optimization strategies have been proposed for improving the learning trajectory and finding better optima. However, designing improved optimization strategies is a difficult task under the conventional landscape view. Here, we propose persistent neurons, a trajectory-based strategy that optimizes the learning task using information from previously converged solutions. More precisely, we utilize the ends of trajectories and let the parameters explore new landscapes by penalizing the model for converging to previous solutions under the same initialization. Persistent neurons can be regarded as a stochastic gradient method with an informed bias, where individual updates are corrupted by deterministic error terms. Specifically, we show that, under certain data distributions, persistent neurons are able to converge to better solutions while initializations under popular frameworks find bad local minima. We further demonstrate that persistent neurons help improve the model's performance under both good and poor initializations. We evaluate the full and partial persistent models and show that they can be used to boost performance on a range of NN architectures, such as AlexNet and residual neural networks (ResNet).
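The abstract describes the penalty only at a high level: reuse the end points of earlier training trajectories and discourage the parameters from converging there again. Below is a minimal sketch of that idea; the function name persistent_loss, the previous_solutions list, the coefficient lam, and the inverse-squared-distance form of the repulsion term are illustrative assumptions, not the paper's exact formulation.

    import torch

    def persistent_loss(model, base_loss, previous_solutions, lam=1e-3):
        """Task loss plus a repulsion term that penalizes the current
        parameters for sitting close to previously converged solutions
        reached from the same initialization (illustrative sketch)."""
        # Flatten all current parameters into a single vector.
        theta = torch.cat([p.reshape(-1) for p in model.parameters()])
        penalty = torch.zeros((), device=theta.device)
        for theta_prev in previous_solutions:
            # Hypothetical penalty form: an inverse squared distance that
            # grows as the parameters approach an earlier solution, biasing
            # each gradient update away from that basin.
            dist_sq = (theta - theta_prev.detach()).pow(2).sum().clamp_min(1e-8)
            penalty = penalty + 1.0 / dist_sq
        return base_loss + lam * penalty

Under this reading, a training run would minimize such an objective from the same initialization as an earlier run, appending each converged parameter vector to previous_solutions before restarting, so that successive trajectories are pushed toward new regions of the landscape.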
