Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective

Paper title

Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective

Authors

Cha, Sungmin; Kwak, Jihwan; Shim, Dongsub; Kim, Hyunwoo; Lee, Moontae; Lee, Honglak; Moon, Taesup

Abstract

Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data while not forgetting past learned classes. The common evaluation protocol for CIL algorithms is to measure the average test accuracy across all classes learned so far. However, we argue that solely focusing on maximizing the test accuracy may not necessarily lead to developing a CIL algorithm that also continually learns and updates the representations, which may be transferred to the downstream tasks. To that end, we experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning and propose new analysis methods. Our experiments show that most state-of-the-art algorithms prioritize high stability and do not significantly change the learned representation, and sometimes even learn a representation of lower quality than a naive baseline. However, we observe that these algorithms can still achieve high test accuracy because they enable a model to learn a classifier that closely resembles an estimated linear classifier trained for linear probing. Furthermore, the base model learned in the first task, which involves single-task learning, exhibits varying levels of representation quality across different algorithms, and this variance impacts the final performance of CIL algorithms. Therefore, we suggest that the representation-level evaluation should be considered as an additional recipe for more diverse evaluation for CIL algorithms.
