Paper title

Hierarchical nucleation in deep neural networks

Authors

Diego Doimo, Aldo Glielmo, Alessio Ansuini, Alessandro Laio

Abstract

Deep convolutional networks (DCNs) learn meaningful representations where data that share the same abstract characteristics are positioned closer and closer. Understanding these representations and how they are generated is of unquestioned practical and theoretical interest. In this work we study the evolution of the probability density of the ImageNet dataset across the hidden layers in some state-of-the-art DCNs. We find that the initial layers generate a unimodal probability density getting rid of any structure irrelevant for classification. In subsequent layers density peaks arise in a hierarchical fashion that mirrors the semantic hierarchy of the concepts. Density peaks corresponding to single categories appear only close to the output and via a very sharp transition which resembles the nucleation process of a heterogeneous liquid. This process leaves a footprint in the probability density of the output layer where the topography of the peaks allows reconstructing the semantic relationships of the categories.
