Paper Title
Sketch3T: Test-Time Training for Zero-Shot SBIR
Paper Authors
Paper Abstract
Zero-shot sketch-based image retrieval (ZS-SBIR) typically asks for a trained model to be applied, as is, to unseen categories. In this paper, we argue that this setup is by definition incompatible with the inherently abstract and subjective nature of sketches: the model might transfer well to new categories, but will not understand sketches drawn under a different test-time distribution. We thus extend ZS-SBIR, asking the model to transfer to both new categories and new sketch distributions. Our key contribution is a test-time training paradigm that can adapt using just one sketch. Since there is no paired photo, we use a sketch raster-to-vector reconstruction module as a self-supervised auxiliary task. To maintain the fidelity of the trained cross-modal joint embedding during test-time updates, we design a novel meta-learning based training paradigm that learns to separate the model updates incurred by this auxiliary task from those of the primary objective of discriminative learning. Extensive experiments show that our model outperforms the state of the art, thanks to the proposed test-time adaptation, which not only transfers to new categories but also accommodates new sketching styles.