Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

Paper title

Not All Instances Contribute Equally: Instance-adaptive Class Representation Learning for Few-Shot Visual Recognition

Authors

Mengya Han, Yibing Zhan, Yong Luo, Bo Du, Han Hu, Yonggang Wen, Dacheng Tao

Abstract

Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm, comparing the query representation with class representations to predict the category of a query instance. However, current metric-based methods generally treat all instances equally and consequently often obtain biased class representations, because not all instances are equally significant when instance-level representations are summarized into a class-level representation. For example, some instances may contain unrepresentative information, such as too much background or information about unrelated concepts, which skews the results. To address these issues, we propose a novel metric-based meta-learning framework, termed instance-adaptive class representation learning network (ICRL-Net), for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network that addresses the biased representation issue when generating the class representation, by learning and assigning adaptive weights to different instances according to their relative significance in the support set of the corresponding class. Additionally, we design an improved bilinear instance representation and incorporate two novel structural losses, i.e., an intra-class instance clustering loss and an inter-class representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: miniImageNet, tieredImageNet, CIFAR-FS, and FC100. Comparisons with state-of-the-art approaches demonstrate the superiority of ICRL-Net.
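The contrast between equal weighting and instance-adaptive weighting can be illustrated with a small sketch. This is not the ICRL-Net implementation: the abstract does not specify how the revaluing network computes its weights, so the function `instance_weights` below is a hypothetical, hand-written stand-in (a softmax over each instance's average cosine similarity to the other support instances), and the `temperature` parameter and the toy 2-D embeddings are likewise assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def mean_prototype(support):
    """Equal-weight baseline: plain average of the support embeddings."""
    n = len(support)
    dim = len(support[0])
    return [sum(vec[d] for vec in support) / n for d in range(dim)]


def instance_weights(support, temperature=0.1):
    """Hypothetical stand-in for a learned weighting network (assumption).

    Each instance is scored by its mean cosine similarity to the other
    instances of the same class; a softmax turns scores into weights.
    """
    n = len(support)
    if n == 1:
        return [1.0]
    scores = []
    for i, vec in enumerate(support):
        sims = [cosine(vec, support[j]) for j in range(n) if j != i]
        scores.append(sum(sims) / len(sims) / temperature)
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_prototype(support, weights):
    """Class representation as a weighted sum of support embeddings."""
    dim = len(support[0])
    return [sum(w * vec[d] for w, vec in zip(weights, support)) for d in range(dim)]


def classify(query, prototypes):
    """Metric-based prediction: label of the most similar class prototype."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


# Toy support set: the third embedding plays the role of an
# unrepresentative instance (e.g. one dominated by background).
support_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights_a = instance_weights(support_a)
proto_equal = mean_prototype(support_a)
proto_adaptive = weighted_prototype(support_a, weights_a)
```

In this toy setting the outlier receives the smallest weight, so `proto_adaptive` stays closer to the two consistent instances than `proto_equal` does. A learned weighting network, as described in the abstract, would replace the fixed similarity heuristic used here.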
