Rethinking Exponential Averaging of the Fisher

Title

Rethinking Exponential Averaging of the Fisher

Author

Puiu, Constantin Octavian

Abstract

In optimization for machine learning (ML), curvature-matrix (CM) estimates typically rely on an exponential average (EA) of local estimates, giving EA-CM algorithms. This approach has little principled justification but is very often used in practice. In this paper, we draw a connection between EA-CM algorithms and what we call a "Wake of Quadratic regularized models". The outlined connection allows us to understand what EA-CM algorithms are doing from an optimization perspective. Generalizing from the established connection, we propose a new family of algorithms, "KL-Divergence Wake-Regularized Models" (KLD-WRM). We give three different practical instantiations of KLD-WRM and show numerically that these outperform K-FAC on MNIST.
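For readers unfamiliar with the EA-CM pattern the abstract refers to, the following is a minimal sketch of an exponential average of local curvature estimates. It is not the paper's KLD-WRM method. It assumes the local estimate is the outer product of a gradient with itself (an empirical-Fisher-style choice), a zero initial matrix, and a decay parameter `rho`; the function names `outer`, `ea_update`, and `ea_curvature` are hypothetical and chosen only for illustration.

```python
def outer(g):
    """Outer product g g^T of a gradient vector, as a nested list."""
    return [[gi * gj for gj in g] for gi in g]


def ea_update(F_bar, F_local, rho):
    """One exponential-average step: rho * F_bar + (1 - rho) * F_local."""
    return [
        [rho * a + (1.0 - rho) * b for a, b in zip(row_bar, row_loc)]
        for row_bar, row_loc in zip(F_bar, F_local)
    ]


def ea_curvature(grads, rho=0.95):
    """Exponentially average local curvature estimates over a gradient sequence.

    Assumption: each local estimate is g g^T, and the running average
    starts from the zero matrix.
    """
    n = len(grads[0])
    F_bar = [[0.0] * n for _ in range(n)]
    for g in grads:
        F_bar = ea_update(F_bar, outer(g), rho)
    return F_bar
```

Calling `ea_curvature(list_of_gradients, rho=0.95)` returns the running curvature-matrix estimate after the last gradient. Larger values of `rho` weight older local estimates more heavily.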
