Paper Title

A Challenging Benchmark of Anime Style Recognition

Authors

Haotang Li, Shengtao Guo, Kailin Lyu, Xiao Yang, Tianchen Chen, Jianqing Zhu, Huanqiang Zeng

Abstract

Given two images of different anime roles, anime style recognition (ASR) aims to learn abstract painting style to determine whether the two images come from the same work, which is an interesting but challenging problem. Unlike biometric recognition, such as face recognition, iris recognition, and person re-identification, ASR suffers from a much larger semantic gap but has received far less attention. In this paper, we propose a challenging ASR benchmark. First, we collect a large-scale ASR dataset (LSASRD), which contains 20,937 images of 190 anime works, where each work has at least ten different roles. Beyond its large scale, LSASRD covers a range of challenging factors, such as complex illumination, various poses, theatrical colors, and exaggerated compositions. Second, we design a cross-role protocol to evaluate ASR performance, in which query and gallery images must come from different roles, so that an ASR model is verified to learn abstract painting style rather than discriminative features of individual roles. Finally, we apply two powerful person re-identification methods, namely AGW and TransReID, to establish baseline performance on LSASRD. Surprisingly, the recent transformer model (i.e., TransReID) achieves only 42.24% mAP on LSASRD. We therefore believe that the ASR task, with its huge semantic gap, deserves deep and long-term research. We will open our dataset and code at https://github.com/nkjcqvcpi/ASR.
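Since the cross-role protocol is the benchmark's central design choice, the following is a minimal sketch of how such an evaluation could be computed. All names here (`cross_role_map`, the `(work_id, role_id)` labeling, the similarity matrix `sim`) are illustrative assumptions, not the authors' released evaluation code, and the sketch assumes role IDs are globally unique across works.

```python
import numpy as np

def cross_role_map(query, gallery, sim):
    """Sketch of mAP under a cross-role protocol.

    query, gallery: lists of (work_id, role_id) per image, with role IDs
        assumed globally unique across works (an assumption of this sketch).
    sim: 2D array, sim[i, j] = similarity of query i to gallery image j.

    Gallery images sharing a role with the query are excluded, so a correct
    match must be a *different* role from the same work, i.e. the model is
    rewarded for matching painting style, not character identity.
    """
    aps = []
    for i, (q_work, q_role) in enumerate(query):
        # Cross-role filter: drop gallery entries of the query's own role.
        keep = [j for j, (_, g_role) in enumerate(gallery) if g_role != q_role]
        order = sorted(keep, key=lambda j: -sim[i, j])  # rank by similarity
        hits, precisions = 0, []
        for rank, j in enumerate(order, start=1):
            if gallery[j][0] == q_work:  # same work => correct style match
                hits += 1
                precisions.append(hits / rank)
        if precisions:
            aps.append(float(np.mean(precisions)))
    return float(np.mean(aps)) if aps else 0.0

# Toy usage: two works; role IDs 0-3 are globally unique.
query = [(0, 0), (1, 2)]
gallery = [(0, 1), (1, 3), (0, 0)]
sim = np.array([[0.9, 0.2, 0.99],
                [0.1, 0.8, 0.30]])
print(cross_role_map(query, gallery, sim))
```

Excluding same-role gallery entries is what distinguishes this protocol from a person re-identification evaluation: without the filter, a model could score well by simply re-detecting the same character rather than recognizing the work's abstract painting style.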
