Paper title
Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer
Paper authors
Abstract
Dealing with severe class imbalance poses a major challenge for real-world applications, especially when the accurate classification and generalization of minority classes are of primary interest. In computer vision, learning from long-tailed datasets is a recurring theme, especially for natural image datasets. While existing solutions mostly appeal to sampling or weighting adjustments to alleviate the pathological imbalance, or impose inductive biases to prioritize non-spurious associations, we take a novel perspective to promote sample efficiency and model generalization based on the invariance principles of causality. Our proposal posits a meta-distributional scenario, where the data-generating mechanism is invariant across the label-conditional feature distributions. Such a causal assumption enables efficient knowledge transfer from the dominant classes to their under-represented counterparts, even if the respective feature distributions show apparent disparities. This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes. Our development is orthogonal to existing extreme classification techniques and can thus be seamlessly integrated with them. The utility of our proposal is validated on an extensive set of synthetic and real-world computer vision tasks against SOTA solutions.