Benchmarking Omni-Vision Representation through the Lens of Visual Realms

Paper Title

Benchmarking Omni-Vision Representation through the Lens of Visual Realms

Authors

Zhang, Yuanhan, Yin, Zhenfei, Shao, Jing, Liu, Ziwei

Abstract

Though impressive performance has been achieved in specific visual realms (e.g., faces, dogs, and places), an omni-vision representation that generalizes to many natural visual domains is highly desirable. However, existing benchmarks are biased and inefficient for evaluating omni-vision representations: they either include only a few specific realms, or cover most realms at the expense of subsuming numerous datasets with extensive realm overlap. In this paper, we propose the Omni-Realm Benchmark (OmniBenchmark). It includes 21 realm-wise datasets with 7,372 concepts and 1,074,346 images. Without semantic overlap, these datasets cover most visual realms comprehensively and efficiently. In addition, we propose a new supervised contrastive learning framework, Relational Contrastive learning (ReCo), for a better omni-vision representation. Beyond pulling two instances from the same concept closer, as in the typical supervised contrastive learning framework, ReCo also pulls two instances from the same semantic realm closer, encoding the semantic relation between concepts and facilitating omni-vision representation learning. On OmniBenchmark, we benchmark ReCo and other advances in omni-vision representation studies that differ in architecture (from CNNs to transformers) and in learning paradigm (from supervised to self-supervised learning). We illustrate the superiority of ReCo over other supervised contrastive learning methods and reveal multiple practical observations to facilitate future research.
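
The idea behind ReCo, as the abstract describes it, can be illustrated with a small sketch. This is not the authors' implementation: the function name `relational_contrastive_loss`, the `temperature` and `realm_weight` parameters, and the specific weighting scheme (weight 1 for same-concept pairs, a smaller weight for same-realm pairs) are all assumptions made for illustration, written in plain Python so it runs without dependencies.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (a zero vector is left unchanged).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def relational_contrastive_loss(embeddings, concepts, realms,
                                temperature=0.1, realm_weight=0.5):
    """Hypothetical sketch: supervised contrastive loss with realm-level positives.

    Same-concept pairs are positives with weight 1.0; pairs that share only
    the realm are positives with weight `realm_weight` (an assumed knob).
    Setting realm_weight=0 gives a plain supervised contrastive loss.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        # Cosine-similarity logits from anchor i to every other sample.
        logits = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                  for j in range(n) if j != i}
        if not logits:
            continue
        # Numerically stable log of the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
        weighted, weight_sum = 0.0, 0.0
        for j, s in logits.items():
            if concepts[j] == concepts[i]:
                w = 1.0
            elif realms[j] == realms[i]:
                w = realm_weight
            else:
                w = 0.0
            weighted += w * (s - log_denom)
            weight_sum += w
        if weight_sum > 0:  # anchors without any positive are skipped
            total -= weighted / weight_sum
            counted += 1
    return total / counted if counted else 0.0

# Toy batch: two huskies and a poodle (realm "dog"), one oak (realm "plant").
concepts = ["husky", "husky", "poodle", "oak"]
realms = ["dog", "dog", "dog", "plant"]
poodle_near_dogs = [[1.0, 0.0], [1.0, 0.05], [0.9, 0.3], [-1.0, 0.0]]
poodle_near_oak = [[1.0, 0.0], [1.0, 0.05], [-1.0, 0.1], [-1.0, 0.0]]

for name, emb in [("poodle near dogs", poodle_near_dogs),
                  ("poodle near oak", poodle_near_oak)]:
    print(name, relational_contrastive_loss(emb, concepts, realms))
```

In this toy batch, the realm-aware loss prefers the embedding where the poodle sits near the huskies, whereas with `realm_weight=0` the loss no longer rewards that arrangement, since the poodle is then treated as an ordinary negative.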
