Paper title
Multisymplectic Formulation of Deep Learning Using Mean-Field Type Control and Nonlinear Stability of Training Algorithm
Authors
Abstract
As it stands, a robust mathematical framework for analysing and studying the various topics in deep learning has yet to emerge. Nonetheless, viewing deep learning as a dynamical system allows established theories to be used to investigate the behaviour of deep neural networks. To study the stability of the training process, in this article we formulate the training of deep neural networks as a hydrodynamics system, which has a multisymplectic structure. To that end, the deep neural network is modelled using a stochastic differential equation, and mean-field type control is then used to train it. The necessary conditions for optimality of the mean-field type control reduce to a system of Euler-Poincaré equations, which has a geometric structure similar to that of compressible fluids. The mean-field type control problem is solved numerically using a multisymplectic numerical scheme that takes advantage of the underlying geometry. Moreover, the numerical scheme yields an approximate solution that is also an exact solution of a hydrodynamics system with a multisymplectic structure, and it can be analysed using backward error analysis. Further, nonlinear stability yields conditions for selecting the number of hidden layers and the number of nodes per layer that make the training stable while approximating the solution of a residual neural network whose number of hidden layers approaches infinity.
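To make the modelling step concrete, the following is a minimal sketch of how a residual network can be read as a time discretisation of a stochastic differential equation, with each hidden layer acting as one Euler-Maruyama step. It is an illustration only and is not the multisymplectic scheme of the paper. The function name `resnet_sde_forward`, the tanh drift, the constant noise level `sigma`, and the time horizon `T` are all assumptions introduced here.

```python
import numpy as np


def resnet_sde_forward(x0, weights, biases, T=1.0, sigma=0.0, rng=None):
    """Propagate x0 through len(weights) residual layers.

    Hypothetical illustration: each layer is one Euler-Maruyama step of
    dX = tanh(W(t) X + b(t)) dt + sigma dW, with step size T / n_layers.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_layers = len(weights)
    dt = T / n_layers  # more hidden layers means a finer time step
    x = np.asarray(x0, dtype=float)
    for W, b in zip(weights, biases):
        drift = np.tanh(W @ x + b)  # assumed layer nonlinearity
        noise = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + dt * drift + noise  # residual (skip) connection
    return x


# Usage: with zero weights and a constant bias the drift is constant, so the
# noise-free output does not depend on the number of layers.
d = 3
x0 = np.array([1.0, -2.0, 0.5])
shallow = resnet_sde_forward(x0, [np.zeros((d, d))] * 4, [np.full(d, 0.3)] * 4)
deep = resnet_sde_forward(x0, [np.zeros((d, d))] * 64, [np.full(d, 0.3)] * 64)
```

In this reading, letting the number of hidden layers grow while `T` stays fixed corresponds to the step size going to zero, which is the limit the abstract refers to.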