Multiview Detection with Cardboard Human Modeling

Paper Title

Multiview Detection with Cardboard Human Modeling

Authors

Jiahao Ma, Zicheng Duan, Liang Zheng, Chuong Nguyen

Abstract

Multiview detection uses multiple calibrated cameras with overlapping fields of view to locate occluded pedestrians. Existing methods in this field typically adopt a "human modeling - aggregation" strategy. To find robust pedestrian representations, some methods combine the 2D perception results from each frame, while others project entire frame features onto the ground plane. However, the former ignores human appearance and leads to many ambiguities, and the latter suffers from projection errors because it lacks accurate heights for the human torso and head. In this paper, we propose a new pedestrian representation scheme based on human point cloud modeling. Specifically, using ray tracing for holistic human depth estimation, we model pedestrians as upright, thin cardboard point clouds on the ground. We then aggregate the pedestrian cardboard point clouds across multiple views to reach a final decision. Compared with existing representations, the proposed method explicitly leverages human appearance and significantly reduces projection errors through relatively accurate height estimation. On four standard evaluation benchmarks, the proposed method achieves highly competitive results. Our code and data will be released at https://github.com/ZichengDuan/MvCHM.
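
The abstract does not give implementation details, so the following is only a minimal geometric sketch of one way the "upright, thin cardboard" idea could be realized. It is not the authors' code. It assumes a pinhole camera with intrinsics `K` and extrinsics `R`, `t` (world to camera), a ground plane at z = 0, and a 2D pedestrian box whose bottom-center pixel is the foot point. The function names `pixel_rays` and `cardboard_points`, the `step` parameter, and the choice of a vertical plane facing the camera are all illustrative assumptions.

```python
import numpy as np

def pixel_rays(K, R, pixels):
    """World-frame ray directions for an (N, 2) array of pixel coordinates."""
    homo = np.hstack([pixels, np.ones((len(pixels), 1))])
    return (R.T @ np.linalg.inv(K) @ homo.T).T

def cardboard_points(K, R, t, box, step=4):
    """Lift the pixels of a 2D box onto an upright plane at the foot point.

    box is (u0, v0, u1, v1) in integer pixels. This is a hypothetical
    helper written for illustration, not the method from the paper.
    """
    u0, v0, u1, v1 = box
    C = -R.T @ t  # camera center in world coordinates

    # Trace the bottom-center pixel to the ground plane z = 0.
    foot_px = np.array([[(u0 + u1) / 2.0, float(v1)]])
    d_foot = pixel_rays(K, R, foot_px)[0]
    foot = C + (-C[2] / d_foot[2]) * d_foot

    # Assumed cardboard plane: vertical, through the foot point,
    # with a horizontal normal pointing towards the camera.
    n = C - foot
    n[2] = 0.0
    n /= np.linalg.norm(n)

    # Sample pixels inside the box and intersect their rays with the plane.
    us, vs = np.meshgrid(np.arange(u0, u1 + 1, step),
                         np.arange(v0, v1 + 1, step))
    pixels = np.stack([us.ravel(), vs.ravel()], axis=1).astype(float)
    d = pixel_rays(K, R, pixels)
    s = (n @ (foot - C)) / (d @ n)  # assumes no ray is parallel to the plane
    return C + s[:, None] * d

# Toy setup: camera 2 m above the ground at y = -5, looking along +y.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
t = np.array([0.0, 2.0, 5.0])
points = cardboard_points(K, R, t, box=(300, 260, 340, 440))
```

In this toy setup the box corresponds to a person standing at the world origin, so the lifted points lie on the plane y = 0 and span heights from the ground up to the top of the box. Per-view clouds produced this way could then be merged in the shared world frame before a ground-plane detector is applied; how the paper performs that aggregation is not specified in the abstract.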
