Paper Title
Designing Network Design Strategies Through Gradient Path Analysis
Paper Authors
Abstract
Designing a high-efficiency, high-quality expressive network architecture has always been the most important research topic in the field of deep learning. Most of today's network design strategies focus on how to integrate features extracted from different layers and how to design computing units that extract these features effectively, thereby enhancing the expressiveness of the network. This paper proposes a new network design strategy: designing the network architecture based on gradient path analysis. On the whole, most of today's mainstream network design strategies are based on the feedforward path; that is, the network architecture is designed around the data path. In this paper, we aim to enhance the expressive ability of the trained model by improving the network's learning ability. Because the mechanism that drives network parameter learning is the backpropagation algorithm, we design network design strategies based on the backpropagation path. We propose gradient path design strategies at the layer level, the stage level, and the network level, and both theoretical analysis and experiments show that these design strategies are superior and feasible.