Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation

Paper title

Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation

Authors

Matteo Fabbri, Fabio Lanzi, Simone Calderara, Stefano Alletto, Rita Cucchiara

Abstract

In this paper, we present a novel approach for bottom-up multi-person 3D human pose estimation from monocular RGB images. We propose to use high-resolution volumetric heatmaps to model joint locations, devising a simple and effective compression method to drastically reduce the size of this representation. At the core of the proposed method lies our Volumetric Heatmap Autoencoder, a fully-convolutional network tasked with compressing ground-truth heatmaps into a dense intermediate representation. A second model, the Code Predictor, is then trained to predict these codes, which can be decompressed at test time to recover the original representation. Our experimental evaluation shows that our method performs favorably compared to the state of the art on both multi-person and single-person 3D human pose estimation datasets and, thanks to our novel compression strategy, can process full-HD images at a constant runtime of 8 fps regardless of the number of subjects in the scene. Code and models are available at https://github.com/fabbrimatteo/LoCO.
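To make the idea of a volumetric heatmap concrete, the following is a minimal sketch and not the authors' implementation. It assumes one D x H x W grid per joint type, a Gaussian peak at each person's joint location, and a simple local-maximum search to read the joints back out. The function names, the grid size, `sigma`, and `threshold` are all hypothetical choices made for illustration; the compression step (the autoencoder and the Code Predictor) is not shown.

```python
import math


def make_volumetric_heatmap(joints, depth, height, width, sigma=1.0):
    """Build a D x H x W grid with a Gaussian peak at each (d, h, w) joint.

    In a bottom-up setting, `joints` would hold the same joint type
    (for example, all left wrists) for every person in the scene.
    """
    heatmap = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for d in range(depth):
        for h in range(height):
            for w in range(width):
                for jd, jh, jw in joints:
                    sq_dist = (d - jd) ** 2 + (h - jh) ** 2 + (w - jw) ** 2
                    value = math.exp(-sq_dist / (2.0 * sigma ** 2))
                    # Overlapping Gaussians are merged with max, not sum.
                    heatmap[d][h][w] = max(heatmap[d][h][w], value)
    return heatmap


def find_peaks(heatmap, threshold=0.7):
    """Return voxels above `threshold` that no 26-connected neighbour exceeds."""
    depth, height, width = len(heatmap), len(heatmap[0]), len(heatmap[0][0])
    peaks = []
    for d in range(depth):
        for h in range(height):
            for w in range(width):
                value = heatmap[d][h][w]
                if value <= threshold:
                    continue
                # Clip the 3 x 3 x 3 neighbourhood at the grid borders.
                neighbours = [
                    heatmap[nd][nh][nw]
                    for nd in range(max(0, d - 1), min(depth, d + 2))
                    for nh in range(max(0, h - 1), min(height, h + 2))
                    for nw in range(max(0, w - 1), min(width, w + 2))
                ]
                if value >= max(neighbours):
                    peaks.append((d, h, w))
    return peaks


# Two people, one joint type, on a toy 8 x 8 x 8 grid.
toy_joints = [(2, 3, 4), (6, 1, 5)]
toy_heatmap = make_volumetric_heatmap(toy_joints, depth=8, height=8, width=8)
print(find_peaks(toy_heatmap))
print("voxels per joint type:", 8 * 8 * 8)
```

Even this toy grid holds 512 values per joint type, and the count grows with the product of the three resolutions, which is the motivation the abstract gives for compressing the representation.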

Paper title