Balanced Contrastive Learning for Long-Tailed Visual Recognition

Paper title

Balanced Contrastive Learning for Long-Tailed Visual Recognition

Authors

Jianggang Zhu, Zheng Wang, Jingjing Chen, Yi-Ping Phoebe Chen, Yu-Gang Jiang

Abstract

Real-world data typically follow a long-tailed distribution, where a few majority categories occupy most of the data while most minority categories contain a limited number of samples. Classification models that minimize cross-entropy struggle to represent and classify the tail classes. Although the problem of learning unbiased classifiers has been well studied, methods for representing imbalanced data are under-explored. In this paper, we focus on representation learning for imbalanced data. Supervised contrastive learning (SCL) has recently shown promising performance on balanced data. However, through our theoretical analysis, we find that for long-tailed data it fails to form a regular simplex, which is an ideal geometric configuration for representation learning. To correct the optimization behavior of SCL and further improve the performance of long-tailed visual recognition, we propose a novel loss for balanced contrastive learning (BCL). Compared with SCL, BCL makes two improvements: class-averaging, which balances the gradient contribution of negative classes, and class-complement, which allows all classes to appear in every mini-batch. The proposed BCL method satisfies the condition for forming a regular simplex and assists the optimization of cross-entropy. Equipped with BCL, the proposed two-branch framework can obtain a stronger feature representation and achieve competitive performance on long-tailed benchmark datasets such as CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist2018. Our code is available at https://github.com/FlamieZhu/BCL.
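The two modifications named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that). It is a minimal pure-Python illustration under stated assumptions: the function name `balanced_contrastive_loss`, the `prototypes` argument (one vector per class, standing in for class-complement), and the default `temperature` are all hypothetical choices made for this example. Class-averaging is modelled by averaging the exponentiated similarities within each class before summing over classes in the denominator.

```python
import math
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def balanced_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """Illustrative class-balanced contrastive loss (assumed form, not the paper's code).

    features:   list of feature vectors for the mini-batch
    labels:     list of integer class labels, one per feature
    prototypes: one vector per class; appended to the batch so that every
                class is represented (the class-complement idea)
    """
    feats = [normalize(f) for f in features]
    protos = [normalize(p) for p in prototypes]

    # Class-complement: extend the batch with one prototype per class.
    all_feats = feats + protos
    all_labels = list(labels) + list(range(len(protos)))

    members = defaultdict(list)
    for idx, y in enumerate(all_labels):
        members[y].append(idx)

    total = 0.0
    for i, y in enumerate(labels):
        sims = [dot(feats[i], f) / temperature for f in all_feats]

        # Class-averaging: each class contributes the mean of its
        # exponentiated similarities, so frequent classes do not dominate.
        denom = 0.0
        for idxs in members.values():
            others = [k for k in idxs if k != i]  # never empty: a prototype remains
            denom += sum(math.exp(sims[k]) for k in others) / len(others)

        positives = [k for k in members[y] if k != i]
        log_denom = math.log(denom)
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)

    return total / len(labels)


if __name__ == "__main__":
    # Toy long-tailed batch: class 0 has three samples, class 1 has one,
    # class 2 is absent and is present only through its prototype.
    batch = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0]]
    protos = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(balanced_contrastive_loss(batch, [0, 0, 0, 1], protos))
```

In this sketch a class missing from the mini-batch still contributes a term to the denominator through its prototype, and a head class with many samples contributes a single averaged term rather than one term per sample.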
