Paper Title
A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Paper Authors
Paper Abstract
Connections between the training of Deep Neural Networks (DNNs) and optimal control theory have attracted considerable attention as a principled tool for algorithmic design. The Differential Dynamic Programming (DDP) neural optimizer is a recently proposed method along this line. Despite its empirical success, its applicability has been limited to feedforward networks, and whether such a trajectory-optimization-inspired framework can be extended to modern architectures remains unclear. In this work, we derive a generalized DDP optimizer that accepts both residual connections and convolutional layers. The resulting optimal control representation admits a game-theoretic perspective, in which training residual networks can be interpreted as cooperative trajectory optimization on state-augmented dynamical systems. This Game Theoretic DDP (GT-DDP) optimizer enjoys the same theoretical connection established in previous work, yet yields a more complex update rule that better leverages the information available during network propagation. Evaluations on image classification datasets (e.g., MNIST and CIFAR100) show improved training convergence and reduced variance over existing methods. Our approach highlights the benefit gained from architecture-aware optimization.
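As a rough sketch of the optimal-control view the abstract refers to (illustrative notation, not the paper's exact formulation): a residual network with layer-wise parameters $\theta_t$ can be read as a discrete-time dynamical system, and training as trajectory optimization over that system,
$$
x_{t+1} = x_t + f_t(x_t, \theta_t), \qquad \min_{\theta_0,\dots,\theta_{T-1}} \; \Phi(x_T) + \sum_{t=0}^{T-1} \ell_t(x_t, \theta_t),
$$
where $x_0$ is the network input, $f_t$ the residual block at layer $t$, $\Phi$ the terminal prediction loss, and $\ell_t$ an optional per-layer regularizer. In this reading, a DDP-style optimizer obtains layer-wise updates from a Bellman-type backward pass over the trajectory rather than treating the weights as a single flat parameter vector.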