Paper Title
A Memory Transformer Network for Incremental Learning
Paper Authors
Paper Abstract
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from. Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes. One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks. In our paper, we propose to enhance the utilization of this memory bank: we not only use it as a source of additional training data like existing works but also integrate it into the prediction process explicitly. Our method, the Memory Transformer Network (MTN), learns how to combine and aggregate the information from the nearest neighbors in the memory with a transformer to make more accurate predictions. We conduct extensive experiments and ablations to evaluate our approach. We show that MTN achieves state-of-the-art performance on the challenging ImageNet-1k and Google-Landmarks-1k incremental learning benchmarks.
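The abstract describes MTN as retrieving the nearest exemplars from the memory bank and aggregating them with the query through a transformer before predicting. Below is a minimal PyTorch sketch of that idea for illustration only; the class name, the cosine-similarity retrieval, the number of neighbors `k`, and all layer sizes are assumptions on our part, not the authors' implementation.

```python
# Minimal sketch of a memory-transformer-style prediction step.
# All names and hyperparameters are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MemoryTransformerSketch(nn.Module):
    def __init__(self, feat_dim=512, num_classes=1000, k=8, num_layers=2, num_heads=8):
        super().__init__()
        self.k = k  # number of nearest exemplars retrieved from the memory bank
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, query_feats, memory_feats):
        # query_feats:  (B, D) features of the current inputs
        # memory_feats: (M, D) features of the stored exemplars
        # 1) Retrieve the k nearest exemplars per query via cosine similarity.
        q = F.normalize(query_feats, dim=-1)
        m = F.normalize(memory_feats, dim=-1)
        nn_idx = (q @ m.t()).topk(self.k, dim=-1).indices   # (B, k)
        neighbors = memory_feats[nn_idx]                     # (B, k, D)
        # 2) Aggregate the query together with its neighbors using the transformer.
        tokens = torch.cat([query_feats.unsqueeze(1), neighbors], dim=1)  # (B, 1+k, D)
        fused = self.transformer(tokens)
        # 3) Classify from the fused query token.
        return self.classifier(fused[:, 0])
```

Prepending the query as the first token and classifying from its fused representation is one plausible way to let a transformer attend over retrieved exemplars; the paper's actual aggregation scheme may differ in its retrieval rule, token layout, and classifier head.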