Paper Title

LPT: Long-tailed Prompt Tuning for Image Classification

Paper Authors

Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo

Paper Abstract

For long-tailed classification, most works pretrain a big model on a large-scale dataset and then fine-tune the whole model to adapt to long-tailed data. Though promising, fine-tuning the whole pretrained model tends to incur high cost in computation and in deploying different models for different tasks, and weakens generalization ability by overfitting to certain features of the long-tailed data. To alleviate these issues, we propose an effective Long-tailed Prompt Tuning (LPT) method for long-tailed classification. LPT introduces several trainable prompts into a frozen pretrained model to adapt it to long-tailed data. For better effectiveness, we divide the prompts into two groups: 1) a shared prompt for the whole long-tailed dataset that learns general features and adapts the pretrained model to the target domain; and 2) group-specific prompts that gather group-specific features for samples with similar features and endow the pretrained model with discrimination ability. We then design a two-phase training paradigm to learn these prompts. In phase 1, we train the shared prompt via supervised prompt tuning to adapt the pretrained model to the desired long-tailed domain. In phase 2, we use the learnt shared prompt as a query to select, from the group-specific prompt set, a small best-matched set of prompts for a group of similar samples, mining the common features of these similar samples, and then optimize these prompts with a dual sampling strategy and an asymmetric GCL loss. By fine-tuning only a few prompts while keeping the pretrained model fixed, LPT reduces training and deployment cost, since only a few prompts need to be stored, and retains the strong generalization ability of the pretrained model. Experiments show that on various long-tailed benchmarks, with only about 1.1% extra parameters, LPT achieves performance comparable to previous whole-model fine-tuning methods and is more robust to domain shift.
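
The two-group prompt design described in the abstract can be pictured with a short sketch. This is a minimal, simplified illustration rather than the authors' implementation: it assumes a frozen ViT-like encoder that consumes a token sequence, uses the pooled output of the shared-prompt pass as the query, and picks the top-k group-specific prompts by cosine similarity to learnable keys. Names such as `LPTSketch`, `num_groups`, `prompt_len`, and `top_k` are illustrative, and the dual sampling strategy and asymmetric GCL loss from phase 2 are omitted.

```python
# Minimal sketch of LPT's shared + group-specific prompts (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LPTSketch(nn.Module):
    def __init__(self, embed_dim=768, prompt_len=10, num_groups=20, top_k=2, num_classes=1000):
        super().__init__()
        # Phase 1: one shared prompt adapts the frozen backbone to the long-tailed domain.
        self.shared_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Phase 2: a pool of group-specific prompts with learnable keys for query matching.
        self.group_prompts = nn.Parameter(torch.randn(num_groups, prompt_len, embed_dim) * 0.02)
        self.group_keys = nn.Parameter(torch.randn(num_groups, embed_dim) * 0.02)
        self.top_k = top_k
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens, frozen_encoder):
        # patch_tokens: (B, N, D) patch embeddings from the frozen pretrained model.
        B = patch_tokens.size(0)
        shared = self.shared_prompt.unsqueeze(0).expand(B, -1, -1)
        # Shared-prompt pass; its pooled output serves as the query.
        feat = frozen_encoder(torch.cat([shared, patch_tokens], dim=1))
        query = feat.mean(dim=1)                                                              # (B, D)
        # Select the best-matched group-specific prompts via cosine similarity to the keys.
        sim = F.cosine_similarity(query.unsqueeze(1), self.group_keys.unsqueeze(0), dim=-1)   # (B, G)
        idx = sim.topk(self.top_k, dim=-1).indices                                            # (B, top_k)
        picked = self.group_prompts[idx].flatten(1, 2)                                        # (B, top_k*prompt_len, D)
        # Second pass with shared + selected group-specific prompts prepended.
        feat2 = frozen_encoder(torch.cat([shared, picked, patch_tokens], dim=1))
        return self.classifier(feat2.mean(dim=1))

# Usage with a stand-in "frozen encoder" (a single transformer layer kept frozen).
encoder = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True).eval()
for p in encoder.parameters():
    p.requires_grad_(False)
model = LPTSketch()
logits = model(torch.randn(2, 196, 768), encoder)
print(logits.shape)  # torch.Size([2, 1000])
```

Only the prompts, keys, and classifier receive gradients while the encoder stays frozen, which is what keeps the trainable parameters to roughly 1% of the backbone, consistent with the ~1.1% extra parameters reported in the abstract.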
