On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

Paper title

On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

Authors

Weinan E, Stephan Wojtowytsch

Abstract

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$, where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i,1}, \dots, x_{i,N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$, and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
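The simplex geometry mentioned in the abstract can be illustrated with a minimal sketch. This is not code from the paper: it assumes that $k$ denotes the number of classes (the abstract does not say so explicitly) and takes the vertices of the standard $(k-1)$-simplex to be the standard basis vectors of $\mathbb{R}^k$. The function names and the value `k = 4` are hypothetical, chosen only for illustration. The check confirms that all pairwise distances between vertices are equal, which is what "regular" means here.

```python
import itertools
import math


def standard_simplex_vertices(k):
    """Return the k standard basis vectors of R^k,
    i.e. the vertices of the standard (k-1)-simplex."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def pairwise_distances(points):
    """Euclidean distance between every unordered pair of points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


k = 4  # hypothetical number of classes, for illustration only
vertices = standard_simplex_vertices(k)
distances = pairwise_distances(vertices)

# Two distinct basis vectors differ in exactly two coordinates,
# so every pairwise distance equals sqrt(2).
all_equal = all(math.isclose(d, math.sqrt(2)) for d in distances)
print(all_equal, len(distances))
```

In the idealized picture described in the abstract, each class $C_i$ would collapse to one such vertex $y_i$ in the penultimate layer. The sketch only verifies the geometry of the target configuration, not the behavior of any trained network.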
