Paper Title

Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations

Authors

Xinyu Yang, Huaxiu Yao, Allan Zhou, Chelsea Finn

Abstract

There is an inescapable long-tailed class-imbalance issue in many real-world classification problems. Current methods for addressing this problem only consider scenarios where all examples come from the same distribution. However, in many cases, there are multiple domains with distinct class imbalance. We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains. Towards that goal, we introduce TALLY, a method that addresses this multi-domain long-tailed learning problem. Built upon a proposed selective balanced sampling strategy, TALLY achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another, producing a new representation for use as data augmentation. To improve the disentanglement of semantic representations, TALLY further utilizes a domain-invariant class prototype that averages out domain-specific effects. We evaluate TALLY on several benchmarks and real-world datasets and find that it consistently outperforms other state-of-the-art methods in both subpopulation and domain shift. Our code and data have been released at https://github.com/huaxiuyao/TALLY.
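
The core augmentation the abstract describes — keeping one example's class semantics while borrowing another example's domain-associated nuisance — can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not TALLY's released implementation: the function names (`class_prototypes`, `mix_semantic_and_nuisance`) and the residual-against-prototype definition of "nuisance" are hypothetical choices for exposition; see the linked repository for the actual method.

```python
import torch


def class_prototypes(features: torch.Tensor, labels: torch.Tensor,
                     num_classes: int) -> torch.Tensor:
    """Mean feature per class, pooled over all domains. Averaging across
    domains is what cancels out domain-specific effects, yielding a
    domain-invariant class prototype."""
    protos = torch.zeros(num_classes, features.size(1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(dim=0)
    return protos


def mix_semantic_and_nuisance(label_a: int, feat_b: torch.Tensor,
                              label_b: int,
                              protos: torch.Tensor) -> torch.Tensor:
    """Build an augmented representation: example A's class semantics
    (its domain-invariant prototype) plus example B's domain nuisance
    (here approximated as B's residual against its own class prototype)."""
    nuisance_b = feat_b - protos[label_b]
    return protos[label_a] + nuisance_b


# Toy usage: 6 examples, 2 classes, 4-dim features.
feats = torch.randn(6, 4)
labels = torch.tensor([0, 0, 1, 1, 0, 1])
protos = class_prototypes(feats, labels, num_classes=2)
# A new class-0 representation rendered in example 3's domain style.
aug = mix_semantic_and_nuisance(label_a=0, feat_b=feats[3], label_b=1,
                                protos=protos)
```

In the actual method, the choice of which two examples to pair is driven by the selective balanced sampling strategy, so that tail classes and rare domains are adequately covered; here the pairing is fixed by hand purely for illustration.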
