Paper Title
Fine-grained Retrieval Prompt Tuning
Paper Authors
Paper Abstract
Fine-grained object retrieval aims to learn discriminative representations for retrieving visually similar objects. However, existing top-performing methods usually impose pairwise similarities on the semantic embedding space or design a localization sub-network, and continually fine-tune the entire model in limited-data scenarios, which leads to convergence to suboptimal solutions. In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation. Specifically, FRPT only needs to learn a small number of parameters in the prompt and the adaptation head instead of fine-tuning the entire model, thereby avoiding the convergence to suboptimal solutions caused by full fine-tuning. Technically, a discriminative perturbation prompt (DPP) is introduced as the sample prompting process: it amplifies, and even exaggerates, discriminative elements that contribute to category prediction via a content-aware inhomogeneous sampling operation. In this way, DPP brings the fine-grained retrieval task, aided by the perturbation prompts, closer to the task solved during the original pre-training, thus preserving the generalization and discrimination of the representations extracted from input samples. In addition, a category-specific awareness head is proposed as the feature adaptation: it removes species-level discrepancies from the features extracted by the pre-trained model using category-guided instance normalization, so that the optimized features capture only the discrepancies among subcategories. Extensive experiments demonstrate that FRPT, with fewer learnable parameters, achieves state-of-the-art performance on three widely used fine-grained datasets.
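The abstract describes the category-specific awareness head as applying category-guided instance normalization to strip species-level statistics from pre-trained features. The paper itself does not give the implementation here, but a minimal sketch of the general idea, assuming per-instance channel-wise normalization followed by affine parameters indexed by super-category (all names and shapes below are illustrative assumptions, not the authors' code):

```python
import numpy as np

def category_guided_instance_norm(feats, gammas, betas, cat_ids, eps=1e-5):
    """Hypothetical sketch of category-guided instance normalization.

    feats:   (N, C, H, W) feature maps from a frozen pre-trained backbone
    gammas:  (K, C) per-category scale parameters (K super-categories)
    betas:   (K, C) per-category shift parameters
    cat_ids: (N,)  super-category index of each sample
    """
    # Normalize each instance's channels, removing instance/species statistics.
    mu = feats.mean(axis=(2, 3), keepdims=True)
    var = feats.var(axis=(2, 3), keepdims=True)
    normed = (feats - mu) / np.sqrt(var + eps)
    # Re-inject category-conditioned affine parameters so that only
    # subcategory-level discrepancies remain to be optimized.
    g = gammas[cat_ids][:, :, None, None]  # (N, C, 1, 1)
    b = betas[cat_ids][:, :, None, None]
    return normed * g + b
```

With identity affine parameters (gamma = 1, beta = 0) this reduces to plain instance normalization; learning `gammas`/`betas` per category is the small set of adaptation parameters, consistent with keeping the backbone frozen.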