The One Where They Reconstructed 3D Humans and Environments in TV Shows

Paper Title

The One Where They Reconstructed 3D Humans and Environments in TV Shows

Authors

Georgios Pavlakos, Ethan Weber, Matthew Tancik, Angjoo Kanazawa

Abstract

TV shows depict a wide variety of human behaviors and have been studied extensively for their potential to be a rich source of data for many applications. However, the majority of the existing work focuses on 2D recognition tasks. In this paper, we make the observation that there is a certain persistence in TV shows, i.e., repetition of the environments and the humans, which makes possible the 3D reconstruction of this content. Building on this insight, we propose an automatic approach that operates on an entire season of a TV show and aggregates information in 3D; we build a 3D model of the environment, compute camera information, static 3D scene structure and body scale information. Then, we demonstrate how this information acts as rich 3D context that can guide and improve the recovery of 3D human pose and position in these environments. Moreover, we show that reasoning about humans and their environment in 3D enables a broad range of downstream applications: re-identification, gaze estimation, cinematography and image editing. We apply our approach on environments from seven iconic TV shows and perform an extensive evaluation of the proposed system.
