Paper Title
Taking Modality-free Human Identification as Zero-shot Learning
Paper Authors
Paper Abstract
Human identification is an important topic in event detection, person tracking, and public security. Numerous methods have been proposed for human identification, such as face identification, person re-identification, and gait identification. Typically, existing methods classify a query image into a specific identity in an image gallery set (I2I). This is severely limiting in the many video surveillance scenarios where only a textual description of the query, or an attribute gallery set, is available (A2I or I2A). However, very few efforts have been devoted to modality-free identification, i.e., identifying a query in a gallery set in a scalable way. In this work, we make an initial attempt and formulate this novel Modality-Free Human Identification (MFHI) task as a generic zero-shot learning model in a scalable way. The model bridges the visual and semantic modalities by learning a discriminative prototype for each identity. In addition, semantics-guided spatial attention is enforced on the visual modality to obtain representations with both high global category-level and local attribute-level discrimination. Finally, we design and conduct an extensive set of experiments on two common and challenging identification tasks, face identification and person re-identification, demonstrating that our method outperforms a wide variety of state-of-the-art methods on modality-free human identification.
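To make the abstract's two ingredients concrete, the sketch below illustrates (a) semantics-guided spatial attention that pools a visual feature map using an attribute embedding, and (b) modality-free matching that ranks gallery identities by comparing a query embedding (from either an image or an attribute description) against per-identity prototypes. This is a minimal PyTorch-style sketch under our own assumptions: the module names, dimensions, scaled dot-product attention form, and cosine-similarity matching are illustrative choices, not the paper's released architecture.

```python
# Illustrative sketch only; shapes, names, and attention form are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticsGuidedAttention(nn.Module):
    """Weights spatial locations of a CNN feature map by their relevance to a
    semantic (attribute) embedding, then pools them into one visual vector."""

    def __init__(self, feat_dim: int, sem_dim: int):
        super().__init__()
        # Map the attribute embedding into the visual feature space.
        self.project = nn.Linear(sem_dim, feat_dim)

    def forward(self, feat_map: torch.Tensor, sem_emb: torch.Tensor) -> torch.Tensor:
        # feat_map: (B, C, H, W); sem_emb: (B, S)
        B, C, H, W = feat_map.shape
        query = self.project(sem_emb)                       # (B, C)
        flat = feat_map.view(B, C, H * W)                   # (B, C, HW)
        # Scaled dot-product scores over spatial locations.
        attn = torch.softmax(
            torch.einsum("bc,bcl->bl", query, flat) / C ** 0.5, dim=-1
        )                                                   # (B, HW)
        # Attention-weighted pooling into a single attended feature.
        return torch.einsum("bl,bcl->bc", attn, flat)       # (B, C)


def identify(query_emb: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Rank gallery identities by cosine similarity between query embeddings
    and learned per-identity prototypes (shared by I2I, A2I, and I2A queries)."""
    query_emb = F.normalize(query_emb, dim=-1)              # (B, D)
    prototypes = F.normalize(prototypes, dim=-1)            # (N_id, D)
    return query_emb @ prototypes.t()                       # (B, N_id) scores
```

Because both image-derived and attribute-derived embeddings are compared against the same set of identity prototypes, the matching step itself is agnostic to the query modality, which is the sense in which the identification is "modality-free" here.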