Paper Title

Anticorrelated Noise Injection for Improved Generalization

Paper Authors

Antonio Orvieto, Hans Kersting, Frank Proske, Francis Bach, Aurelien Lucchi

Paper Abstract

Injecting artificial noise into gradient descent (GD) is commonly employed to improve the performance of machine learning models. Usually, uncorrelated noise is used in such perturbed gradient descent (PGD) methods. It is, however, not known if this is optimal or whether other types of noise could provide better generalization performance. In this paper, we zoom in on the problem of correlating the perturbations of consecutive PGD steps. We consider a variety of objective functions for which we find that GD with anticorrelated perturbations ("Anti-PGD") generalizes significantly better than GD and standard (uncorrelated) PGD. To support these experimental findings, we also derive a theoretical analysis that demonstrates that Anti-PGD moves to wider minima, while GD and PGD remain stuck in suboptimal regions or even diverge. This new connection between anticorrelated noise and generalization opens the field to novel ways to exploit noise for training machine learning models.
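The abstract describes Anti-PGD as gradient descent whose injected perturbations are anticorrelated across consecutive steps, in contrast to the i.i.d. noise of standard PGD. Below is a minimal NumPy sketch of that idea, injecting the difference of consecutive i.i.d. Gaussian samples; the function name `anti_pgd`, the step size `eta`, the noise scale `sigma`, and the toy quadratic objective are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def anti_pgd(grad, x0, eta=0.01, sigma=0.1, steps=1000, rng=None):
    """Gradient descent with anticorrelated perturbations (sketch).

    Injects the increment xi_{k+1} - xi_k of i.i.d. Gaussian samples,
    so consecutive perturbations have correlation -1/2.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    xi_prev = sigma * rng.standard_normal(x.shape)
    for _ in range(steps):
        xi = sigma * rng.standard_normal(x.shape)
        # Standard (uncorrelated) PGD would add xi alone;
        # the anticorrelated variant adds the difference instead.
        x = x - eta * grad(x) + (xi - xi_prev)
        xi_prev = xi
    return x

# Illustrative use on a toy quadratic L(x) = 0.5 * ||x||^2 (not from the paper).
if __name__ == "__main__":
    grad = lambda x: x
    print(anti_pgd(grad, x0=np.ones(5)))
```

One intuition for this construction: because each sample xi_k is added at step k and subtracted at step k+1, the injected noise telescopes, so the cumulative perturbation over many steps stays bounded rather than growing like a random walk, while each individual step is still randomly perturbed.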
