Paper Title

Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach

Paper Authors

Bhat, Prashant; Zonooz, Bahram; Arani, Elahe

Paper Abstract

Continual learning (CL) over non-stationary data streams remains one of the long-standing challenges in deep neural networks (DNNs) as they are prone to catastrophic forgetting. CL models can benefit from self-supervised pre-training as it enables learning more generalizable task-agnostic features. However, the effect of self-supervised pre-training diminishes as the length of task sequences increases. Furthermore, the domain shift between pre-training data distribution and the task distribution reduces the generalizability of the learned representations. To address these limitations, we propose Task Agnostic Representation Consolidation (TARC), a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning whereby self-supervised training is followed by supervised learning for each task. To further restrict the deviation from the learned representations in the self-supervised stage, we employ a task-agnostic auxiliary loss during the supervised stage. We show that our training paradigm can be easily added to memory- or regularization-based approaches and provides consistent performance gain across more challenging CL settings. We further show that it leads to more robust and well-calibrated models.
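The abstract describes TARC only at a high level: for each task, a task-agnostic self-supervised stage is followed by a supervised stage in which the same task-agnostic objective is kept as an auxiliary loss. The sketch below is a minimal, hypothetical illustration of that two-stage per-task loop in PyTorch-style Python; it is not the authors' implementation. The rotation-prediction pretext task, the names `train_task`, `pretext_head`, and `self_supervised_loss`, and the `aux_weight` value are all illustrative assumptions — the paper's actual self-supervised objective and hyperparameters are not specified here.

```python
import torch
import torch.nn.functional as F

def self_supervised_loss(encoder, pretext_head, x):
    """Illustrative task-agnostic objective: predict which of 4 rotations was applied.
    A placeholder for whatever self-supervised loss the paper actually uses."""
    rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
    targets = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    logits = pretext_head(encoder(rotated))
    return F.cross_entropy(logits, targets)

def train_task(encoder, classifier, pretext_head, loader,
               ssl_epochs=1, sup_epochs=1, aux_weight=0.1, lr=1e-3):
    """Hypothetical two-stage loop for a single task in the CL sequence."""
    params = (list(encoder.parameters()) + list(classifier.parameters())
              + list(pretext_head.parameters()))
    opt = torch.optim.SGD(params, lr=lr)

    # Stage 1: task-agnostic self-supervised training on the current task's data.
    for _ in range(ssl_epochs):
        for x, _ in loader:
            loss = self_supervised_loss(encoder, pretext_head, x)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: task-specific supervised training, keeping the task-agnostic
    # objective as an auxiliary loss to limit drift from the Stage-1 representations.
    for _ in range(sup_epochs):
        for x, y in loader:
            ce = F.cross_entropy(classifier(encoder(x)), y)
            aux = self_supervised_loss(encoder, pretext_head, x)
            loss = ce + aux_weight * aux
            opt.zero_grad()
            loss.backward()
            opt.step()
```

In this reading, `train_task` would be called once per task in the stream, and could be wrapped around a memory-based or regularization-based CL method, consistent with the abstract's claim that the paradigm can be added on top of such approaches.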
