Paper Title
Contrasting the landscape of contrastive and non-contrastive learning
Paper Authors
Paper Abstract
Many recent advances in unsupervised feature learning are based on designing features that are invariant under semantic data augmentations. A common way to achieve this is contrastive learning, which uses positive and negative samples. However, some recent works have shown promising results for non-contrastive learning, which does not require negative samples. Non-contrastive losses, though, have obvious "collapsed" minima, in which the encoder outputs a constant feature embedding, independent of the input. A folk conjecture is that as long as these collapsed solutions are avoided, the resulting feature representations should be good. In our paper, we cast doubt on this story: we show, through theoretical results and controlled experiments, that even on simple data models, non-contrastive losses have a preponderance of non-collapsed bad minima. Moreover, we show that the training process does not avoid these minima.
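The collapse phenomenon mentioned in the abstract can be made concrete with a small sketch. The example below is purely illustrative and is not the paper's code: the toy data, the noise-based augmentation, the linear encoder, and the alignment-only loss (negative cosine similarity between two augmented views) are all assumptions made for demonstration. It only shows the trivial failure mode the abstract refers to, namely that an encoder outputting a constant embedding attains the global minimum of such a non-contrastive loss.

```python
# Minimal sketch (illustrative assumptions, not the paper's setup):
# a non-contrastive, alignment-only loss is minimized by a "collapsed"
# encoder that ignores its input and outputs a constant embedding.
import numpy as np

rng = np.random.default_rng(0)

def augment(x):
    # Toy stand-in for a semantic data augmentation: additive noise.
    return x + 0.5 * rng.standard_normal(x.shape)

def normalize(z):
    # L2-normalize each embedding so the loss is a cosine similarity.
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def non_contrastive_loss(z1, z2):
    # Negative mean cosine similarity between two views (no negative samples).
    return -np.mean(np.sum(normalize(z1) * normalize(z2), axis=1))

x = rng.standard_normal((128, 16))   # a batch of toy inputs
W = rng.standard_normal((16, 8))     # a random linear encoder

# Non-collapsed encoder: embeddings depend on the (augmented) input.
z1, z2 = augment(x) @ W, augment(x) @ W
print("random encoder loss:   ", non_contrastive_loss(z1, z2))   # > -1

# Collapsed encoder: constant embedding regardless of the input.
const = np.ones((x.shape[0], 8))
print("collapsed encoder loss:", non_contrastive_loss(const, const))  # exactly -1, the minimum
```

The paper's point goes further than this sketch: beyond these obvious collapsed minima, the loss also has many non-collapsed bad minima that training does not avoid.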