Paper Title
Robustifying Deep Vision Models Through Shape Sensitization
Paper Authors
Paper Abstract
Recent work has shown that deep vision models tend to be overly dependent on low-level or "texture" features, leading to poor generalization. Various data augmentation strategies have been proposed to overcome this so-called texture bias in DNNs. We propose a simple, lightweight adversarial augmentation technique that explicitly incentivizes the network to learn holistic shapes for accurate prediction in an object classification setting. Our augmentations superpose the edgemap of one image onto a patch-shuffled version of another image, using a randomly determined mixing proportion, and assign the augmented image the label of the edgemap's source image. To classify these augmented images, the model must not only detect and focus on edges but also distinguish between relevant and spurious edges. We show that our augmentations significantly improve classification accuracy and robustness measures on a range of datasets and neural architectures. For example, for ViT-S, we obtain absolute gains in classification accuracy of up to 6%. We also obtain gains of up to 28% and 8.5% on natural adversarial and out-of-distribution datasets such as ImageNet-A (for ViT-B) and ImageNet-R (for ViT-S), respectively. Analysis using a range of probe datasets shows substantially increased shape sensitivity in our trained models, explaining the observed improvements in robustness and classification accuracy.
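The augmentation described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the abstract does not specify the edge extractor, the patch size, or the sampling range of the mixing proportion, so the gradient-magnitude edgemap, the 8x8 patches, and the uniform mixing range below are all assumptions.

```python
import numpy as np

def edge_map(img):
    """Approximate edgemap via gradient magnitude, normalized to [0, 1].
    (Stand-in for the paper's unspecified edge extractor.)"""
    gy = np.abs(np.diff(img, axis=0, append=img[-1:]))
    gx = np.abs(np.diff(img, axis=1, append=img[:, -1:]))
    e = gx + gy
    return e / (e.max() + 1e-8)

def shuffle_patches(img, patch=8, rng=None):
    """Split a (H, W) image into patch x patch tiles and permute them.
    Assumes H and W are divisible by `patch`."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    tiles = [img[i:i + patch, j:j + patch]
             for i in range(0, h, patch) for j in range(0, w, patch)]
    order = rng.permutation(len(tiles))
    out = np.empty_like(img)
    k = 0
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            out[i:i + patch, j:j + patch] = tiles[order[k]]
            k += 1
    return out

def shape_augment(img_a, img_b, rng=None):
    """Superpose the edgemap of img_a onto patch-shuffled img_b with a
    random mixing proportion. The training label is that of img_a,
    forcing the model to pick out relevant edges among spurious ones."""
    rng = rng or np.random.default_rng()
    lam = rng.uniform(0.3, 0.7)  # mixing proportion; range is an assumption
    return lam * edge_map(img_a) + (1 - lam) * shuffle_patches(img_b, rng=rng)
```

In a training loop, `img_b` would be another image sampled from the batch, and the augmented example would be paired with `img_a`'s class label.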