Title
Continual Prune-and-Select: Class-incremental learning with specialized subnetworks
Authors
Abstract
The human brain is capable of learning tasks sequentially, mostly without forgetting. Deep neural networks (DNNs), however, suffer from catastrophic forgetting when learning one task after another. We address this challenge in a class-incremental learning scenario, in which the DNN sees test data without knowing the task from which the data originates. During training, Continual-Prune-and-Select (CP&S) finds a subnetwork within the DNN that is responsible for solving a given task. Then, during inference, CP&S selects the correct subnetwork to make predictions for that task. A new task is learned by training the available (previously untrained) neuronal connections of the DNN and pruning them into a new subnetwork; this subnetwork may include previously trained connections belonging to other subnetwork(s), because shared connections are never updated. This eliminates catastrophic forgetting by creating specialized regions in the DNN that do not conflict with each other, while still allowing knowledge transfer across them. The CP&S strategy is implemented with different subnetwork selection strategies and shows superior performance to state-of-the-art continual learning methods on various datasets (CIFAR-100, CUB-200-2011, ImageNet-100, and ImageNet-1000). In particular, CP&S is capable of sequentially learning 10 tasks from ImageNet-1000 while maintaining an accuracy of around 94% with negligible forgetting, a first-of-its-kind result in class-incremental learning. To the best of the authors' knowledge, this represents an improvement in accuracy of more than 10% compared to the best alternative method.
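To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of the core idea: each task owns a binary mask over a layer's weights, connections claimed by earlier tasks stay usable in the forward pass but have their gradients zeroed, and pruning shrinks the new task's mask into a specialized subnetwork. This is a hypothetical illustration under simple assumptions (magnitude pruning, a single linear layer); the names `MaskedLinear`, `start_task`, `prune_task`, and `finish_task` are invented here and are not the authors' implementation.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Linear layer with per-task binary weight masks (CP&S-style sketch)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.task_masks = {}                           # task_id -> {0,1} mask
        self.frozen = torch.zeros_like(self.weight)    # union of earlier tasks' masks
        # Zero the gradient of connections owned by previous subnetworks,
        # so shared connections are reused but never updated.
        self.weight.register_hook(lambda g: g * (1.0 - self.frozen))

    def start_task(self, task_id):
        # A new task may initially use every connection; pruning shrinks this.
        self.task_masks[task_id] = torch.ones_like(self.weight)

    def prune_task(self, task_id, keep_ratio=0.2):
        # Magnitude pruning restricted to connections the task currently uses.
        mask = self.task_masks[task_id]
        scores = (self.weight.abs() * mask).flatten()
        k = max(1, int(keep_ratio * mask.sum().item()))
        threshold = scores.topk(k).values.min()
        self.task_masks[task_id] = ((self.weight.abs() >= threshold) & (mask > 0)).float()

    def finish_task(self, task_id):
        # Freeze the surviving connections so future tasks cannot modify them.
        self.frozen = torch.clamp(self.frozen + self.task_masks[task_id], max=1.0)

    def forward(self, x, task_id):
        # Only the selected task's subnetwork participates in the forward pass.
        return nn.functional.linear(x, self.weight * self.task_masks[task_id])
```

A full CP&S model would apply such masks to every layer and, because the class-incremental setting is task-agnostic at test time, would also need the subnetwork-selection step the paper describes, e.g., some confidence-based criterion for picking which task's mask to apply to a test input; the sketch above omits that step.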