Paper Title

How Data Augmentation affects Optimization for Linear Regression

Paper Authors

Boris Hanin, Yi Sun

Paper Abstract

Though data augmentation has rapidly emerged as a key tool for optimization in modern machine learning, a clear picture of how augmentation schedules affect optimization and interact with optimization hyperparameters such as learning rate is nascent. In the spirit of classical convex optimization and recent work on implicit bias, the present work analyzes the effect of augmentation on optimization in the simple convex setting of linear regression with MSE loss. We find joint schedules for learning rate and data augmentation scheme under which augmented gradient descent provably converges and characterize the resulting minimum. Our results apply to arbitrary augmentation schemes, revealing complex interactions between learning rates and augmentations even in the convex setting. Our approach interprets augmented (S)GD as a stochastic optimization method for a time-varying sequence of proxy losses. This gives a unified way to analyze learning rate, batch size, and augmentations ranging from additive noise to random projections. From this perspective, our results, which also give rates of convergence, can be viewed as Monro-Robbins type conditions for augmented (S)GD.
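
To make the setting concrete, below is a minimal, illustrative sketch (not the paper's algorithm or its convergence conditions) of augmented gradient descent for linear regression with MSE loss: each step draws a fresh random augmentation of the inputs (here, additive Gaussian noise) and takes a gradient step on the resulting proxy loss. The particular decaying learning-rate and augmentation-strength schedules, the choice of augmentation, and all variable names are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X w_true + noise
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
T = 5000
for t in range(1, T + 1):
    lr = 1.0 / (10 + t)        # assumed decaying learning-rate schedule
    sigma = 1.0 / np.sqrt(t)   # assumed decaying augmentation-strength schedule

    # Fresh random augmentation each step: additive noise on the inputs.
    X_aug = X + sigma * rng.normal(size=X.shape)

    # Gradient of the proxy MSE loss (1/2n)||X_aug w - y||^2 on augmented data.
    grad = X_aug.T @ (X_aug @ w - y) / n
    w -= lr * grad

print("final parameter estimate:", w)
print("distance to w_true:", np.linalg.norm(w - w_true))
```

Swapping the additive-noise line for another per-step random transformation (e.g., a random projection of the inputs) changes the augmentation scheme without changing the structure of the iteration, which is the sense in which the abstract's analysis applies to arbitrary augmentation schemes.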
