Paper Title
Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective
Paper Authors
Paper Abstract
This work jointly considers the discriminability and transferability of deep representations in a typical supervised learning task, namely image classification. Through a comprehensive temporal analysis, we observe a trade-off between these two properties: discriminability keeps increasing as training progresses, while transferability diminishes sharply in the later training period. From the perspective of information-bottleneck theory, we reveal that the incompatibility between discriminability and transferability is attributable to the over-compression of input information. More importantly, we investigate why and how the InfoNCE loss can alleviate the over-compression, and further present a learning framework, named contrastive temporal coding (CTC), to counteract the over-compression and alleviate the incompatibility. Extensive experiments validate that CTC successfully mitigates the incompatibility, yielding discriminative and transferable representations. Noticeable improvements are achieved on the image classification task and on challenging transfer learning tasks. We hope that this work will draw attention to the importance of the transferability property in the conventional supervised learning setting. Code is available at https://github.com/DTennant/dt-tradeoff.
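For readers unfamiliar with the InfoNCE loss named in the abstract, the following is a minimal NumPy sketch of its generic form. It is not the CTC framework and is not taken from the authors' repository. The function name `info_nce`, the `temperature` default, and the convention that row i of `queries` is paired with row i of `keys` are all assumptions made for illustration.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's CTC objective).

    Assumes row i of `queries` forms a positive pair with row i of `keys`;
    all other rows of `keys` act as negatives for that query.
    """
    # L2-normalise so the dot product below is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # (N, N) matrix of scaled similarities; positives lie on the diagonal.
    logits = q @ k.T / temperature

    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)

    # Row-wise log-softmax, then average the negative log-probability
    # assigned to the positive (diagonal) entries.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

The loss is low when each query is far more similar to its own key than to the other keys in the batch, and it rises toward or above log N when positives are indistinguishable from negatives.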