Paper Title
Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning

Paper Authors

Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu

Paper Abstract


Transductive Zero-Shot Learning (ZSL) aims to recognize unseen categories by aligning visual and semantic information in a joint embedding space. There exist four kinds of domain bias in transductive ZSL, i.e., the visual bias and semantic bias between the two domains, and the two visual-semantic biases within the seen and unseen domains respectively, but existing work focuses on only a subset of them, which leads to severe semantic ambiguity during knowledge transfer. To solve this problem, we propose a novel Attribute-Induced Bias Eliminating (AIBE) module for transductive ZSL. Specifically, for the visual bias between the two domains, a Mean-Teacher module is first leveraged to bridge the visual representation discrepancy between the domains using unsupervised learning on unlabelled images. Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes graph operations to capture the semantic relationships between categories. Besides, to reduce the semantic-visual bias in the seen domain, we align the visual center of each category, instead of individual visual data points, with the corresponding semantic attributes, which further preserves semantic relationships in the embedding space. Finally, for the semantic-visual bias in the unseen domain, an unseen semantic alignment constraint is designed to align the visual and semantic spaces in an unsupervised manner. Evaluations on several benchmarks demonstrate the effectiveness of the proposed method, e.g., obtaining 82.8%/75.5%, 97.1%/82.5%, and 73.2%/52.1% under the Conventional/Generalized ZSL settings on the CUB, AwA2, and SUN datasets, respectively.
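Two of the components named in the abstract are simple enough to sketch. The snippet below is a minimal, hypothetical illustration — not the authors' code — of (a) the standard Mean-Teacher exponential-moving-average update that keeps a teacher model tracking a student, and (b) the seen-domain center alignment idea: projecting each category's *mean* visual embedding into the attribute space and penalizing its distance to the class attribute vector. All names, shapes, and the momentum value are assumptions for illustration.

```python
import numpy as np

def ema_update(teacher_w, student_w, momentum=0.99):
    """Mean-Teacher style update: the teacher's weights are an
    exponential moving average of the student's weights.
    The momentum value 0.99 is an illustrative assumption."""
    return momentum * teacher_w + (1 - momentum) * student_w

def center_alignment_loss(visual_feats, labels, attributes, W):
    """Seen-domain center alignment (illustrative sketch).

    visual_feats: (N, d_v) image embeddings
    labels:       (N,) integer class indices
    attributes:   (C, d_a) per-class semantic attribute vectors
    W:            (d_v, d_a) learned visual-to-semantic projection

    Returns the mean squared distance between each projected
    class visual center and its attribute vector, so the loss
    aligns class centers rather than individual data points.
    """
    classes = np.unique(labels)
    loss = 0.0
    for c in classes:
        center = visual_feats[labels == c].mean(axis=0)  # class visual center
        projected = center @ W                           # map into attribute space
        loss += np.mean((projected - attributes[c]) ** 2)
    return loss / len(classes)
```

If the class centers already project exactly onto their attribute vectors, the loss is zero; any per-class deviation of the projected center from its attributes raises it, while within-class scatter around the center is ignored by construction.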