Paper Title

Continual Learning from the Perspective of Compression

Paper Authors

Xu He, Min Lin

Paper Abstract

Connectionist models such as neural networks suffer from catastrophic forgetting. In this work, we study this problem from the perspective of information theory and define forgetting as the increase of description lengths of previous data when they are compressed with a sequentially learned model. In addition, we show that continual learning approaches based on variational posterior approximation and generative replay can be considered as approximations to two prequential coding methods in compression, namely, the Bayesian mixture code and maximum likelihood (ML) plug-in code. We compare these approaches in terms of both compression and forgetting and empirically study the reasons that limit the performance of continual learning methods based on variational posterior approximation. To address these limitations, we propose a new continual learning method that combines ML plug-in and Bayesian mixture codes.
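
To make the coding-theoretic terms above concrete, the following is a minimal sketch of the standard prequential (online) coding quantities, written in generic notation ($x_{1:T}$ for a data sequence, $\theta$ for model parameters) that is assumed here and is not necessarily the paper's own.

The prequential code length of a sequence $x_{1:T}$ is

\[ L_{\mathrm{preq}}(x_{1:T}) = \sum_{t=1}^{T} -\log p(x_t \mid x_{1:t-1}). \]

The Bayesian mixture code instantiates $p(x_t \mid x_{1:t-1})$ with the posterior predictive distribution,

\[ p_{\mathrm{mix}}(x_t \mid x_{1:t-1}) = \int p(x_t \mid \theta)\, p(\theta \mid x_{1:t-1})\, \mathrm{d}\theta, \]

while the ML plug-in code uses a point estimate fitted to the data seen so far,

\[ p_{\mathrm{plug}}(x_t \mid x_{1:t-1}) = p(x_t \mid \hat{\theta}_{t-1}), \qquad \hat{\theta}_{t-1} = \arg\max_{\theta}\, p(x_{1:t-1} \mid \theta). \]

In this view, forgetting of earlier data $x_{1:s}$ can be measured as the increase in their description length under a later model, e.g. $\bigl[-\log p(x_{1:s} \mid \hat{\theta}_t)\bigr] - \bigl[-\log p(x_{1:s} \mid \hat{\theta}_s)\bigr]$ for $t > s$ in the plug-in case.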
