Exemplar Normalization for Learning Deep Representation

Title

Exemplar Normalization for Learning Deep Representation

Authors

Ruimao Zhang, Zhanglin Peng, Lingyun Wu, Zhen Li, Ping Luo

Abstract

Normalization techniques are important in different advanced neural networks and different tasks. This work investigates a novel dynamic learning-to-normalize (L2N) problem by proposing Exemplar Normalization (EN), which is able to learn different normalization methods for different convolutional layers and image samples of a deep network. EN significantly improves the flexibility of the recently proposed switchable normalization (SN), which solves a static L2N problem by linearly combining several normalizers in each normalization layer (the combination is the same for all samples). Instead of directly employing a multi-layer perceptron (MLP) to learn data-dependent parameters as conditional batch normalization (cBN) does, the internal architecture of EN is carefully designed to stabilize its optimization, leading to many appealing benefits. (1) EN enables different convolutional layers, image samples, categories, benchmarks, and tasks to use different normalization methods, shedding light on analyzing them in a holistic view. (2) EN is effective for various network architectures and tasks. (3) It can replace any normalization layer in a deep network and still produce stable model training. Extensive experiments demonstrate the effectiveness of EN in a wide spectrum of tasks including image recognition, noisy label learning, and semantic segmentation. For example, by replacing BN in the ordinary ResNet50, the improvement produced by EN is 300% more than that of SN on both ImageNet and the noisy WebVision dataset.
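
The contrast the abstract draws between a static combination of normalizers (one set of weights shared by all samples) and a sample-dependent combination can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the choice of three candidate normalizers, the function names, and especially the way per-sample weights are produced here (a softmax over a linear projection `proj` of spatially pooled features) are assumptions made purely for illustration. The abstract states that EN's actual internal architecture is more carefully designed than a direct learned mapping of this kind.

```python
import numpy as np

def normalize(x, axes, eps=1e-5):
    """Standardize x using the mean and variance computed over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(z, axis=-1):
    """Numerically stable softmax along `axis`."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def candidates(x):
    """Stack three candidate normalizations of x with shape (N, C, H, W)."""
    return np.stack([
        normalize(x, (0, 2, 3)),  # batch-style statistics
        normalize(x, (2, 3)),     # instance-style statistics
        normalize(x, (1, 2, 3)),  # layer-style statistics
    ])                            # result shape: (3, N, C, H, W)

def static_combination(x, logits):
    """Static case: the same mixing weights are applied to every sample."""
    w = softmax(logits)                          # shape (3,)
    return np.tensordot(w, candidates(x), axes=1)

def per_sample_combination(x, proj):
    """Dynamic case: each sample gets its own mixing weights.

    `proj` (shape (C, 3)) is a hypothetical parameter used only in this sketch.
    """
    pooled = x.mean(axis=(2, 3))                 # shape (N, C)
    w = softmax(pooled @ proj, axis=1)           # shape (N, 3), rows sum to 1
    out = np.einsum("nk,knchw->nchw", w, candidates(x))
    return out, w
```

With an all-zero `proj`, every sample receives uniform weights and the dynamic version reduces to the static one with zero logits; any non-zero `proj` lets the weights differ from sample to sample.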
