Paper Title

Does Data Augmentation Benefit from Split BatchNorms

Paper Authors

Amil Merchant, Barret Zoph, Ekin Dogus Cubuk

Paper Abstract

Data augmentation has emerged as a powerful technique for improving the performance of deep neural networks and led to state-of-the-art results in computer vision. However, state-of-the-art data augmentation strongly distorts training images, leading to a disparity between examples seen during training and inference. In this work, we explore a recently proposed training paradigm in order to correct for this disparity: using an auxiliary BatchNorm for the potentially out-of-distribution, strongly augmented images. Our experiments then focus on how to define the BatchNorm parameters that are used at evaluation. To eliminate the train-test disparity, we experiment with using the batch statistics defined by clean training images only, yet surprisingly find that this does not yield improvements in model performance. Instead, we investigate using BatchNorm parameters defined by weak augmentations and find that this method significantly improves the performance of common image classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet. We then explore a fundamental trade-off between accuracy and robustness coming from using different BatchNorm parameters, providing greater insight into the benefits of data augmentation on model performance.
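
The auxiliary-BatchNorm setup the abstract describes can be sketched in a few lines. Below is a minimal, hedged PyTorch sketch (not the authors' code): `SplitBatchNorm2d` and the `strongly_augmented` flag are illustrative names assumed here, not an API from the paper. The sketch routes strongly augmented batches through an auxiliary BatchNorm during training, while weakly augmented (or clean) batches update the main BatchNorm; at evaluation the module falls back to the main branch, so inference normalizes with the statistics defined by weak augmentations, the variant the abstract reports as performing best.

```python
# Minimal sketch of the auxiliary-BatchNorm idea, assuming PyTorch.
# SplitBatchNorm2d and the strongly_augmented flag are illustrative
# names introduced for this sketch, not the authors' API.
import torch
import torch.nn as nn


class SplitBatchNorm2d(nn.Module):
    """BatchNorm with an auxiliary branch for strongly augmented batches."""

    def __init__(self, num_features: int):
        super().__init__()
        self.bn_main = nn.BatchNorm2d(num_features)  # weak/clean statistics
        self.bn_aux = nn.BatchNorm2d(num_features)   # strong-augmentation statistics

    def forward(self, x: torch.Tensor, strongly_augmented: bool = False) -> torch.Tensor:
        # During training, pick the branch matching the augmentation strength.
        # In eval mode both submodules are frozen, and we always use bn_main,
        # i.e. the running statistics defined by weakly augmented images.
        if self.training and strongly_augmented:
            return self.bn_aux(x)
        return self.bn_main(x)


bn = SplitBatchNorm2d(64)
bn.train()
weak = torch.randn(8, 64, 32, 32)
strong = torch.randn(8, 64, 32, 32)
_ = bn(weak)                             # updates bn_main running stats
_ = bn(strong, strongly_augmented=True)  # updates bn_aux running stats
bn.eval()
out = bn(weak)  # inference uses bn_main's (weak-augmentation) statistics
```

Which branch's statistics are consulted at test time is exactly the choice the paper studies, and it directly controls the accuracy-robustness trade-off discussed in the abstract.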
