Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

Paper Title

Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

Authors

Xiong, Zheng; Chai, Liangyu; Liu, Wenxi; Liu, Yongtuo; Ren, Sucheng; He, Shengfeng

Abstract

Crowd images are arguably among the most laborious data to annotate. In this paper, we aim to reduce the massive demand for densely labeled crowd data and propose a novel weakly-supervised setting in which we leverage the binary ranking of two images with high-contrast crowd counts as training guidance. To enable training under this new setting, we convert the crowd count regression problem into a ranking potential prediction problem. In particular, we tailor a Siamese Ranking Network that predicts potential scores for two images, indicating the ordering of their counts. Hence, the ultimate goal is to assign appropriate potentials to all crowd images so that their orderings obey the ranking labels. On the other hand, potentials reveal relative crowd sizes but cannot yield an exact crowd count. We resolve this problem by introducing "anchors" during the inference stage. Concretely, anchors are a few images with count labels that are used to look up the counts corresponding to potential scores through a simple linear mapping function. We conduct extensive experiments to study various combinations of supervision, and we show that the proposed method outperforms existing weakly-supervised methods by a large margin without additional labeling effort.
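
The two ideas in the abstract, training on pairwise rankings and converting potentials to counts through anchors, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state the loss, so a logistic pairwise ranking loss is swapped in, and the "simple linear mapping function" is taken here to be a least-squares line fitted over the anchors. The function names (`pairwise_ranking_loss`, `fit_anchor_mapping`, `potential_to_count`) are hypothetical and do not come from the paper.

```python
import math


def pairwise_ranking_loss(potential_hi, potential_lo):
    """Logistic pairwise ranking loss (an assumed stand-in, not the paper's loss).

    potential_hi is the score of the image labeled as more crowded.
    The loss shrinks as potential_hi rises above potential_lo.
    """
    return math.log1p(math.exp(-(potential_hi - potential_lo)))


def fit_anchor_mapping(anchor_potentials, anchor_counts):
    """Fit count ~ a * potential + b by ordinary least squares over the anchors.

    This is one possible reading of a "simple linear mapping"; the paper may
    define the mapping differently.
    """
    n = len(anchor_potentials)
    if n < 2 or n != len(anchor_counts):
        raise ValueError("need at least two anchors with matching counts")
    mean_p = sum(anchor_potentials) / n
    mean_c = sum(anchor_counts) / n
    var_p = sum((p - mean_p) ** 2 for p in anchor_potentials)
    if var_p == 0:
        raise ValueError("anchor potentials must not all be equal")
    cov = sum(
        (p - mean_p) * (c - mean_c)
        for p, c in zip(anchor_potentials, anchor_counts)
    )
    a = cov / var_p
    b = mean_c - a * mean_p
    return a, b


def potential_to_count(potential, a, b):
    """Map a potential score to an estimated count, clamped at zero."""
    return max(0.0, a * potential + b)


# Hypothetical anchors: potentials predicted by the network and their labeled counts.
a, b = fit_anchor_mapping([0.0, 1.0, 2.0], [10, 110, 210])
estimate = potential_to_count(1.5, a, b)
```

In this sketch only the anchors carry count labels, and training pairs need nothing more than a binary "which image is more crowded" label, which matches the supervision split described in the abstract.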
