Paper Title
Large-to-small Image Resolution Asymmetry in Deep Metric Learning
Paper Authors
Paper Abstract
Deep metric learning for vision is trained by optimizing a representation network to map (non-)matching image pairs to (non-)similar representations. During testing, which typically corresponds to image retrieval, both database and query examples are processed by the same network to obtain the representation used for similarity estimation and ranking. In this work, we explore an asymmetric setup by light-weight processing of the query at a small image resolution to enable fast representation extraction. The goal is to obtain a network for database examples that is trained to operate on large resolution images and benefits from fine-grained image details, and a second network for query examples that operates on small resolution images but preserves a representation space aligned with that of the database network. We achieve this with a distillation approach that transfers knowledge from a fixed teacher network to a student via a loss that operates per image and solely relies on coupled augmentations without the use of any labels. In contrast to prior work that explores such asymmetry from the point of view of different network architectures, this work uses the same architecture but modifies the image resolution. We conclude that resolution asymmetry is a better way to optimize the performance/efficiency trade-off than architecture asymmetry. Evaluation is performed on three standard deep metric learning benchmarks, namely CUB200, Cars196, and SOP. Code: https://github.com/pavelsuma/raml
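The distillation setup described above can be sketched in a few lines: a fixed teacher embeds the large-resolution image, a student with the same architecture embeds a small-resolution view of the same image, and a per-image loss aligns the two representations without any labels. The following is a minimal NumPy sketch with toy stand-ins for the networks; the function names, the nearest-neighbour downsampling, and the cosine-alignment loss are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(image, weights):
    """Toy stand-in for a CNN encoder: global-average-pool the image,
    apply a linear map, then L2-normalize the representation."""
    pooled = image.mean(axis=(0, 1))                 # (C,) channel means
    rep = weights @ pooled                           # (D,) embedding
    return rep / np.linalg.norm(rep)

def downsample(image, factor=4):
    """Nearest-neighbour downsampling: the student's small-resolution view
    of the same training image (illustrative choice of augmentation)."""
    return image[::factor, ::factor]

C, D = 3, 8
teacher_w = rng.normal(size=(D, C))                  # fixed teacher weights
student_w = teacher_w + 0.01 * rng.normal(size=(D, C))  # student being trained

image = rng.random((64, 64, C))                      # one "large" training image
t_rep = embed(image, teacher_w)                      # database-side representation
s_rep = embed(downsample(image), student_w)          # query-side representation

# Per-image alignment loss without labels: 1 - cosine similarity
# (a common distillation choice; the paper's loss may differ).
loss = 1.0 - float(t_rep @ s_rep)
print(f"alignment loss: {loss:.4f}")
```

Because both representations are L2-normalized, the loss lies in [0, 2] and shrinks as the student's small-resolution embedding rotates into agreement with the teacher's large-resolution one, which is exactly the alignment the asymmetric retrieval setup requires.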