Paper Title


Unifying Specialist Image Embedding into Universal Image Embedding

Paper Authors

Yang Feng, Futang Peng, Xu Zhang, Wei Zhu, Shanfeng Zhang, Howard Zhou, Zhen Li, Tom Duerig, Shih-Fu Chang, Jiebo Luo

Paper Abstract


Deep image embedding provides a way to measure the semantic similarity of two images. It plays a central role in many applications such as image search, face verification, and zero-shot learning. It is desirable to have a universal deep embedding model applicable to various domains of images. However, existing methods mainly rely on training specialist embedding models each of which is applicable to images from a single domain. In this paper, we study an important but unexplored task: how to train a single universal image embedding model to match the performance of several specialists on each specialist's domain. Simply fusing the training data from multiple domains cannot solve this problem because some domains become overfitted sooner when trained together using existing methods. Therefore, we propose to distill the knowledge in multiple specialists into a universal embedding to solve this problem. In contrast to existing embedding distillation methods that distill the absolute distances between images, we transform the absolute distances between images into a probabilistic distribution and minimize the KL-divergence between the distributions of the specialists and the universal embedding. Using several public datasets, we validate that our proposed method accomplishes the goal of universal image embedding.
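The distillation objective described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of such a relational distillation loss: pairwise distances within a batch are converted into per-image probability distributions with a temperature-scaled softmax, and the universal (student) embedding is trained to match each specialist's (teacher's) distribution under a KL-divergence. The function name `relational_kd_loss`, the choice of Euclidean distances, and the temperature parameter are illustrative assumptions, not details confirmed by the abstract.

```python
import torch
import torch.nn.functional as F

def relational_kd_loss(specialist_emb, universal_emb, temperature=1.0):
    """Hypothetical sketch: match the pairwise-distance structure of a frozen
    specialist embedding (teacher) with the universal embedding (student) by
    minimizing KL-divergence between distance-derived distributions.
    The distance metric, temperature, and normalization are assumptions."""
    # Pairwise Euclidean distances within the batch, shape (B, B).
    d_spec = torch.cdist(specialist_emb, specialist_emb)
    d_univ = torch.cdist(universal_emb, universal_emb)

    # Drop self-distances so each row's distribution ranges over the other images.
    B = d_spec.size(0)
    mask = ~torch.eye(B, dtype=torch.bool, device=d_spec.device)
    d_spec = d_spec[mask].view(B, B - 1)
    d_univ = d_univ[mask].view(B, B - 1)

    # Turn distances into probabilities: closer images receive higher mass,
    # hence the negative sign before the softmax.
    p_spec = F.softmax(-d_spec / temperature, dim=1)          # teacher distribution
    log_q_univ = F.log_softmax(-d_univ / temperature, dim=1)  # student log-distribution

    # KL(teacher || student), averaged over the batch.
    return F.kl_div(log_q_univ, p_spec, reduction="batchmean")

# Usage sketch with random stand-in embeddings (128-d, batch of 8).
spec = torch.randn(8, 128)                        # frozen specialist output
univ = torch.randn(8, 128, requires_grad=True)    # universal model output
loss = relational_kd_loss(spec, univ, temperature=0.5)
loss.backward()
```

In a multi-specialist setting, one such term per specialist domain would presumably be summed so that the single universal embedding is supervised by all teachers at once; the exact weighting and training schedule are not specified in the abstract.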
