Paper Title

5th Place Solution to Kaggle Google Universal Image Embedding Competition

Authors

Noriaki Ota, Shingo Yokoi, Shinsuke Yamaoka

Abstract

In this paper, we present our solution, which placed 5th in the Kaggle Google Universal Image Embedding Competition in 2022. We use the ViT-H visual encoder of CLIP from the OpenCLIP repository as a backbone and train a head model composed of BatchNormalization and Linear layers using ArcFace. The training data is a subset of Products10K, GLDv2, GPR1200, and Food101. Applying test-time augmentation (TTA) to a subset of the images further improves the score. With this method, we achieve a score of 0.684 on the public leaderboard and 0.688 on the private leaderboard. Our code is available at https://github.com/riron1206/kaggle-Google-Universal-Image-Embedding-Competition-5th-Place-Solution
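
To make the ArcFace step mentioned in the abstract concrete, the following is a minimal sketch of how an additive angular margin changes the logits for a single embedding. It is not the authors' implementation: the function names, the `scale` and `margin` values, and the toy vectors are all assumptions chosen for illustration, and the backbone and the BatchNormalization/Linear head are not modeled here.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Return scaled cosine logits with an angular margin on the target class.

    scale and margin are illustrative defaults, not values from the paper.
    This simple version does not handle the case theta + margin > pi.
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if j == target:
            # Penalize the true class by widening its angle by `margin`.
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Hypothetical 3-dimensional embedding and three class centers.
embedding = [0.9, 0.1, 0.0]
class_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

plain = arcface_logits(embedding, class_weights, target=0, margin=0.0)
with_margin = arcface_logits(embedding, class_weights, target=0, margin=0.5)

print("loss without margin:", cross_entropy(plain, 0))
print("loss with margin:   ", cross_entropy(with_margin, 0))
```

The margin lowers only the target-class logit, so the loss for the same embedding goes up; during training this pushes embeddings of the same class closer together in angle, which is the property a retrieval-style embedding benefits from.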
