Paper title
Unsupervised visualization of image datasets using contrastive learning
Paper authors
Paper abstract
Visualization methods based on the nearest neighbor graph, such as t-SNE or UMAP, are widely used for visualizing high-dimensional data. Yet these approaches only produce meaningful results if the nearest neighbors themselves are meaningful. For images represented in pixel space this is not the case: distances in pixel space often do not capture our sense of similarity, so nearest neighbors are not semantically close. Self-supervised approaches based on contrastive learning, such as SimCLR, circumvent this problem by relying on data augmentation to generate implicit neighbors, but these methods do not produce two-dimensional embeddings suitable for visualization. Here, we present a new method, called t-SimCNE, for unsupervised visualization of image data. t-SimCNE combines ideas from contrastive learning and neighbor embeddings, and trains a parametric mapping from the high-dimensional pixel space into two dimensions. We show that the resulting 2D embeddings achieve classification accuracy comparable to the state-of-the-art high-dimensional SimCLR representations, thus faithfully capturing semantic relationships. Using t-SimCNE, we obtain informative visualizations of the CIFAR-10 and CIFAR-100 datasets, showing rich cluster structure and highlighting artifacts and outliers.
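The abstract does not spell out the training objective, so the following is only a minimal sketch of how contrastive learning and neighbor embeddings might be combined in two dimensions. It assumes an InfoNCE-style loss (as used in SimCLR) in which similarity is measured by the Cauchy kernel familiar from t-SNE. Both choices, and the function names `cauchy_similarity` and `contrastive_loss_2d`, are illustrative assumptions rather than details given above. The network that maps images to 2D points is omitted; the inputs are taken to be already-computed 2D embeddings of two augmented views per image.

```python
import math


def cauchy_similarity(u, v):
    """Cauchy kernel 1 / (1 + ||u - v||^2), the low-dimensional kernel of t-SNE."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1.0 / (1.0 + sq_dist)


def contrastive_loss_2d(views_a, views_b):
    """InfoNCE-style loss over 2D embeddings (assumed form, not the paper's code).

    views_a[i] and views_b[i] are the embeddings of two augmentations of image i.
    For every point, its partner view is the positive; all other points in the
    batch serve as negatives.
    """
    points = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i, anchor in enumerate(points):
        positive = points[(i + n) % (2 * n)]
        # Denominator: similarity to every other point in the batch.
        denom = sum(
            cauchy_similarity(anchor, points[k]) for k in range(2 * n) if k != i
        )
        total += -math.log(cauchy_similarity(anchor, positive) / denom)
    return total / (2 * n)


# Two images, two views each: matching views lie close together in 2D.
aligned_a = [(0.0, 0.0), (10.0, 10.0)]
aligned_b = [(0.1, 0.0), (10.0, 10.1)]
loss_aligned = contrastive_loss_2d(aligned_a, aligned_b)

# Same points, but the view pairing is swapped, so positives are far apart.
loss_swapped = contrastive_loss_2d(aligned_a, list(reversed(aligned_b)))
```

Under these assumptions the loss is small when the two views of each image land close together and far from other images, and large when the pairing is broken. This is the sense in which augmentations act as implicit neighbors.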