Paper Title

Provable and Efficient Continual Representation Learning

Paper Authors

Yingcong Li, Mingchen Li, M. Salman Asif, Samet Oymak

Paper Abstract

In continual learning (CL), the goal is to design models that can learn a sequence of tasks without catastrophic forgetting. While there is a rich set of techniques for CL, relatively little understanding exists on how representations built by previous tasks benefit new tasks that are added to the network. To address this, we study the problem of continual representation learning (CRL) where we learn an evolving representation as new tasks arrive. Focusing on zero-forgetting methods where tasks are embedded in subnetworks (e.g., PackNet), we first provide experiments demonstrating CRL can significantly boost sample efficiency when learning new tasks. To explain this, we establish theoretical guarantees for CRL by providing sample complexity and generalization error bounds for new tasks by formalizing the statistical benefits of previously-learned representations. Our analysis and experiments also highlight the importance of the order in which we learn the tasks. Specifically, we show that CL benefits if the initial tasks have large sample size and high "representation diversity". Diversity ensures that adding new tasks incurs small representation mismatch and can be learned with few samples while training only few additional nonzero weights. Finally, we ask whether one can ensure each task subnetwork to be efficient during inference time while retaining the benefits of representation learning. To this end, we propose an inference-efficient variation of PackNet called Efficient Sparse PackNet (ESPN) which employs joint channel & weight pruning. ESPN embeds tasks in channel-sparse subnets requiring up to 80% less FLOPs to compute while approximately retaining accuracy and is very competitive with a variety of baselines. In summary, this work takes a step towards data and compute-efficient CL with a representation learning perspective. GitHub page: https://github.com/ucr-optml/CtRL
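To make the abstract's mechanism concrete, below is a minimal sketch (not the authors' implementation, which is available at the GitHub page above) of how a zero-forgetting method in the PackNet style can embed each task in its own subnetwork via binary weight masks, and how an ESPN-style variant can additionally mask whole output channels so each task's subnet is channel-sparse and cheaper at inference. The class name `TaskMaskedConv` and the mask-initialization in the usage example are hypothetical and for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskMaskedConv(nn.Module):
    """Conv layer gated by per-task weight and channel masks (illustrative sketch)."""

    def __init__(self, in_ch, out_ch, kernel_size, num_tasks):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        # One binary weight mask per task: 1 = this weight belongs to the task's subnet.
        self.register_buffer(
            "weight_masks", torch.zeros(num_tasks, *self.conv.weight.shape)
        )
        # ESPN-style channel masks: 1 = this output channel is active for the task.
        self.register_buffer("channel_masks", torch.zeros(num_tasks, out_ch))

    def forward(self, x, task_id):
        # Select the task's subnetwork by masking the shared weights; weights assigned
        # to earlier tasks are left untouched, which is what gives zero forgetting.
        w = self.conv.weight * self.weight_masks[task_id]
        out = F.conv2d(x, w, self.conv.bias, padding=self.conv.padding)
        # Zero out inactive output channels so the task's subnet is channel-sparse,
        # reducing the FLOPs needed at inference time.
        return out * self.channel_masks[task_id].view(1, -1, 1, 1)


if __name__ == "__main__":
    layer = TaskMaskedConv(in_ch=3, out_ch=8, kernel_size=3, num_tasks=2)
    # Pretend task 0 has been trained: keep roughly half the weights and half the channels.
    layer.weight_masks[0] = (torch.rand_like(layer.conv.weight) > 0.5).float()
    layer.channel_masks[0, :4] = 1.0
    y = layer(torch.randn(1, 3, 16, 16), task_id=0)
    print(y.shape)  # torch.Size([1, 8, 16, 16])
```

In this sketch the masks are fixed buffers; in practice they would be chosen by pruning after training each task, with newly added tasks allowed to reuse (but not modify) the weights already claimed by earlier tasks, which is how the shared representation can benefit later tasks.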
