Paper Title

What Information Does a ResNet Compress?

Authors

Luke Nicholas Darlow, Amos Storkey

Abstract

The information bottleneck principle (Shwartz-Ziv & Tishby, 2017) suggests that SGD-based training of deep neural networks results in optimally compressed hidden layers, from an information theoretic perspective. However, this claim was established on toy data. The goal of the work we present here is to test whether the information bottleneck principle is applicable to a realistic setting using a larger and deeper convolutional architecture, a ResNet model. We trained PixelCNN++ models as inverse representation decoders to measure the mutual information between hidden layers of a ResNet and input image data, when trained for (1) classification and (2) autoencoding. We find that two stages of learning happen for both training regimes, and that compression does occur, even for an autoencoder. Sampling images by conditioning on hidden layers' activations offers an intuitive visualisation to understand what a ResNet learns to forget.
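For context, the measurement described above rests on a standard variational bound: the cross-entropy of any conditional decoder q(x | h) upper-bounds the true conditional entropy H(x | h), so a well-trained decoder yields a lower bound on the mutual information between an input x and a hidden representation h. A minimal sketch of that decomposition (the bound itself is textbook information theory; its pairing with a PixelCNN++ decoder is taken from the abstract above):

$$
I(\mathbf{x}; \mathbf{h}) \;=\; H(\mathbf{x}) - H(\mathbf{x} \mid \mathbf{h}) \;\ge\; H(\mathbf{x}) + \mathbb{E}_{p(\mathbf{x}, \mathbf{h})}\bigl[\log q(\mathbf{x} \mid \mathbf{h})\bigr],
$$

where q(x | h) is the decoder (here, PixelCNN++ conditioned on the ResNet's hidden-layer activations). The better the decoder approximates the true conditional p(x | h), the tighter the bound, so tracking the decoder's log-likelihood over training traces how much input information each layer retains.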
