Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics

Paper title

Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics

Authors

Jiang, Chunheng; Pedapati, Tejaswini; Chen, Pin-Yu; Sun, Yizhou; Gao, Jianxi

Abstract

Efficient model selection for identifying a suitable pre-trained neural network for a downstream task is a fundamental yet challenging problem in deep learning. Current practice incurs high computational costs, because candidate models must be trained in order to predict their performance. In this paper, we propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training. Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections. Therefore, a converged neural network is associated with an equilibrium state of a networked system composed of those edges. To this end, we construct a network mapping $\phi$ that converts a neural network $G_A$ to a directed line graph $G_B$ defined on the edges of $G_A$. Next, we derive a neural capacitance metric $\beta_{\mathrm{eff}}$ as a predictive measure that universally captures the generalization capability of $G_A$ on the downstream task using only a handful of early training results. We carried out extensive experiments using 17 popular pre-trained ImageNet models and five benchmark datasets (CIFAR10, CIFAR100, SVHN, Fashion MNIST and Birds) to evaluate the fine-tuning performance of our framework. Our neural capacitance metric is shown to be a powerful indicator for model selection based only on early training results, and it is more efficient than state-of-the-art methods.
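
The line-graph idea in the abstract can be illustrated with a minimal sketch. It uses the textbook definition of a directed line graph: every edge of the original graph becomes a node, and edge (u, v) points to edge (v, w) whenever the first ends where the second begins. This is an assumption made for illustration only; the abstract does not specify how the paper's mapping orients or weights the edges of $G_B$, and the helper names `layered_network_edges` and `directed_line_graph` are hypothetical.

```python
def layered_network_edges(layer_sizes):
    """Edges of a fully connected feed-forward network (hypothetical helper).

    A neuron is identified by the pair (layer_index, unit_index).
    """
    edges = []
    for layer in range(len(layer_sizes) - 1):
        for i in range(layer_sizes[layer]):
            for j in range(layer_sizes[layer + 1]):
                edges.append(((layer, i), (layer + 1, j)))
    return edges


def directed_line_graph(edges):
    """Textbook directed line graph (an assumption, not the paper's exact mapping).

    Each input edge (u, v) becomes a node. Node (u, v) points to node (v, w)
    whenever the head of the first edge is the tail of the second.
    """
    edges = list(edges)
    # Index the edges by their tail neuron for quick successor lookup.
    outgoing = {}
    for u, v in edges:
        outgoing.setdefault(u, []).append((u, v))
    line_edges = []
    for u, v in edges:
        for successor in outgoing.get(v, []):
            line_edges.append(((u, v), successor))
    return line_edges


# A tiny 2-2-1 network: 2*2 + 2*1 = 6 synaptic connections.
g_a = layered_network_edges([2, 2, 1])
g_b = directed_line_graph(g_a)
print(len(g_a), "edges in G_A become", len(g_a), "nodes in G_B")
print(len(g_b), "edges in G_B")
```

In this toy case each of the four input-to-hidden connections feeds exactly one hidden-to-output connection, so the line graph has four edges. A dynamical system over synaptic connections, as described in the abstract, would then be defined on the nodes of this second graph.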
