Paper Title

Task Agnostic Representation Consolidation: a Self-supervised based Continual Learning Approach

Paper Authors

Bhat, Prashant; Zonooz, Bahram; Arani, Elahe

Paper Abstract

Continual learning (CL) over non-stationary data streams remains one of the long-standing challenges in deep neural networks (DNNs) as they are prone to catastrophic forgetting. CL models can benefit from self-supervised pre-training as it enables learning more generalizable task-agnostic features. However, the effect of self-supervised pre-training diminishes as the length of task sequences increases. Furthermore, the domain shift between pre-training data distribution and the task distribution reduces the generalizability of the learned representations. To address these limitations, we propose Task Agnostic Representation Consolidation (TARC), a two-stage training paradigm for CL that intertwines task-agnostic and task-specific learning whereby self-supervised training is followed by supervised learning for each task. To further restrict the deviation from the learned representations in the self-supervised stage, we employ a task-agnostic auxiliary loss during the supervised stage. We show that our training paradigm can be easily added to memory- or regularization-based approaches and provides consistent performance gain across more challenging CL settings. We further show that it leads to more robust and well-calibrated models.
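The abstract describes TARC only at a high level: for each task, a task-agnostic self-supervised stage is followed by a supervised stage in which the same task-agnostic objective is kept as an auxiliary loss. The sketch below is a minimal, hypothetical illustration of that two-stage per-task loop in PyTorch-style Python; it is not the authors' implementation. The rotation-prediction pretext task, the names `train_task`, `pretext_head`, and `self_supervised_loss`, and the `aux_weight` value are all illustrative assumptions — the paper's actual self-supervised objective and hyperparameters are not specified here.

```python
import torch
import torch.nn.functional as F

def self_supervised_loss(encoder, pretext_head, x):
    """Illustrative task-agnostic objective: predict which of 4 rotations was applied.
    A placeholder for whatever self-supervised loss the paper actually uses."""
    rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
    targets = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    logits = pretext_head(encoder(rotated))
    return F.cross_entropy(logits, targets)

def train_task(encoder, classifier, pretext_head, loader,
               ssl_epochs=1, sup_epochs=1, aux_weight=0.1, lr=1e-3):
    """Hypothetical two-stage loop for a single task in the CL sequence."""
    params = (list(encoder.parameters()) + list(classifier.parameters())
              + list(pretext_head.parameters()))
    opt = torch.optim.SGD(params, lr=lr)

    # Stage 1: task-agnostic self-supervised training on the current task's data.
    for _ in range(ssl_epochs):
        for x, _ in loader:
            loss = self_supervised_loss(encoder, pretext_head, x)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: task-specific supervised training, keeping the task-agnostic
    # objective as an auxiliary loss to limit drift from the Stage-1 representations.
    for _ in range(sup_epochs):
        for x, y in loader:
            ce = F.cross_entropy(classifier(encoder(x)), y)
            aux = self_supervised_loss(encoder, pretext_head, x)
            loss = ce + aux_weight * aux
            opt.zero_grad()
            loss.backward()
            opt.step()
```

In this reading, `train_task` would be called once per task in the stream, and could be wrapped around a memory-based or regularization-based CL method, consistent with the abstract's claim that the paradigm can be added on top of such approaches.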
