Paper Title
GradAug: A New Regularization Method for Deep Neural Networks
Paper Authors
Paper Abstract
We propose a new regularization method to alleviate over-fitting in deep neural networks. The key idea is to use randomly transformed training samples to regularize a set of sub-networks, which are generated by sampling the width of the original network during training. The proposed method thereby introduces self-guided disturbances into the raw gradients of the network and is therefore termed Gradient Augmentation (GradAug). We demonstrate that GradAug helps the network learn well-generalized and more diverse representations. Moreover, it is easy to implement and can be applied to a variety of network structures and applications. GradAug improves ResNet-50 to 78.79% accuracy on ImageNet classification, a new state of the art. Combined with CutMix, it further boosts the performance to 79.67%, outperforming an ensemble of advanced training tricks. Its generalization ability is evaluated on COCO object detection and instance segmentation, where GradAug significantly surpasses other state-of-the-art methods. GradAug is also robust to image distortions and FGSM adversarial attacks, and it is highly effective in low-data regimes. Code is available at https://github.com/taoyang1122/GradAug
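To make the key idea concrete, below is a minimal PyTorch sketch of one GradAug training step, not the official implementation (see the repository above). It assumes a slimmable-style model whose active width can be switched through a hypothetical set_width(model, width) callable, uses random rotation as one example of a random input transformation, and treats num_subnets and width_range as illustrative hyperparameters; the sampled sub-networks are supervised with the same ground-truth labels as the full network.

import random

import torch.nn.functional as F
import torchvision.transforms.functional as TF

def gradaug_step(model, set_width, images, targets, optimizer,
                 num_subnets=3, width_range=(0.8, 1.0)):
    """One GradAug update: a full-width pass on the raw batch, then
    several sampled sub-network passes on transformed views of it."""
    optimizer.zero_grad()

    # The full network is trained on the original (untransformed) batch.
    set_width(model, 1.0)  # hypothetical helper that sets the active width
    F.cross_entropy(model(images), targets).backward()

    # Each sampled sub-network sees a randomly transformed view of the same
    # batch; its gradients accumulate onto the full-network gradients,
    # acting as the self-guided disturbance described in the abstract.
    for _ in range(num_subnets):
        set_width(model, random.uniform(*width_range))
        transformed = TF.rotate(images, angle=random.uniform(-30.0, 30.0))
        F.cross_entropy(model(transformed), targets).backward()

    optimizer.step()

Accumulating the sub-network losses into the same backward pass before a single optimizer step is what perturbs the gradient itself, rather than only the input data.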