Paper Title
Elastic Weight Consolidation Improves the Robustness of Self-Supervised Learning Methods under Transfer
Paper Authors
Paper Abstract
Self-supervised representation learning (SSL) methods provide an effective label-free initial condition for fine-tuning on downstream tasks. However, in many realistic scenarios the downstream task may be biased with respect to the target label distribution. This, in turn, moves the learned fine-tuned model posterior away from the initial (label) bias-free self-supervised model posterior. In this work, we re-interpret SSL fine-tuning through the lens of Bayesian continual learning and consider regularization via the Elastic Weight Consolidation (EWC) framework. We demonstrate that self-regularization against an initial SSL backbone improves worst sub-group performance on Waterbirds by 5% and on Celeb-A by 2% when using the ViT-B/16 architecture. Furthermore, to simplify the use of EWC with SSL, we pre-compute and publicly release the Fisher Information Matrix (FIM), estimated with 10,000 ImageNet-1K variates on large modern SSL architectures, including ViT-B/16 and a ResNet50 trained with DINO.
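For intuition, the EWC self-regularization described above anchors the fine-tuned weights θ to the SSL backbone weights θ* via the quadratic penalty L(θ) = L_task(θ) + (λ/2) Σ_i F_i (θ_i − θ*_i)², where F is the (diagonal) Fisher Information Matrix. The sketch below illustrates this penalty in PyTorch under stated assumptions: the names ewc_penalty, ssl_params, fisher, and lam are hypothetical, and this is a minimal illustration of the standard EWC term, not the authors' released code.

```python
import torch


def ewc_penalty(model, ssl_params, fisher, lam=1.0):
    """Quadratic EWC penalty anchoring fine-tuned weights to an SSL backbone.

    ssl_params / fisher: dicts mapping parameter names to the frozen SSL
    weights (theta*) and their diagonal Fisher information (F). Parameters
    absent from `fisher` (e.g. a fresh task head) are left unregularized.
    """
    device = next(model.parameters()).device
    penalty = torch.zeros((), device=device)
    for name, p in model.named_parameters():
        if name in fisher:
            # F_i * (theta_i - theta*_i)^2, summed over backbone parameters
            penalty = penalty + (fisher[name] * (p - ssl_params[name]).pow(2)).sum()
    return 0.5 * lam * penalty


# During fine-tuning, the penalty is simply added to the task loss:
#   loss = task_loss + ewc_penalty(model, ssl_params, fisher, lam)
```

A pre-computed FIM, such as the one the authors release, would slot in as the `fisher` dict here, so fine-tuning only pays the cost of the weighted squared distance at each step.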