Paper Title
Self-Supervised 3D Human Pose Estimation in Static Video Via Neural Rendering
Paper Authors
Paper Abstract
Inferring 3D human pose from 2D images is a challenging and long-standing problem in computer vision, with many applications including motion capture, virtual reality, surveillance, and gait analysis for sports and medicine. We present preliminary results for a method that estimates 3D pose from 2D video containing a single person and a static background, without the need for any manual landmark annotations. We achieve this by formulating a simple yet effective self-supervision task: our model is required to reconstruct a random frame of a video given a frame from another timepoint and a rendered image of a transformed human shape template. Crucially for optimisation, our ray-casting-based rendering pipeline is fully differentiable, enabling end-to-end training based solely on the reconstruction task.
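The role of a differentiable renderer in such a reconstruction objective can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes a single 3D Gaussian blob in place of the human shape template, one orthographic ray per pixel, a plain mean squared error as the reconstruction loss, and hand-derived gradients instead of automatic differentiation. All function names and parameters (`render`, `loss_and_grad`, `fit`, `sigma`, `lr`) are hypothetical.

```python
import math


def render(cx, cy, size=16, depth=8, sigma=3.0):
    """Cast one orthographic ray per pixel and average Gaussian density along it."""
    cz = depth / 2.0  # blob depth is fixed; orthographic rays cannot recover it
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            total = 0.0
            for z in range(depth):  # depth samples along the ray
                d2 = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                total += math.exp(-d2 / (2.0 * sigma ** 2))
            row.append(total / depth)
        image.append(row)
    return image


def loss_and_grad(cx, cy, target, size=16, depth=8, sigma=3.0):
    """Mean squared reconstruction error and its gradient w.r.t. (cx, cy)."""
    image = render(cx, cy, size, depth, sigma)
    n = size * size
    loss = gx = gy = 0.0
    for y in range(size):
        for x in range(size):
            pixel = image[y][x]
            diff = pixel - target[y][x]
            loss += diff * diff / n
            # Every ray sample is Gaussian in cx and cy, so
            # d(pixel)/d(cx) = pixel * (x - cx) / sigma^2 (same for cy).
            gx += 2.0 * diff * pixel * (x - cx) / sigma ** 2 / n
            gy += 2.0 * diff * pixel * (y - cy) / sigma ** 2 / n
    return loss, gx, gy


def fit(target, cx=6.0, cy=6.0, lr=50.0, steps=200):
    """Recover the blob position by gradient descent on the reconstruction loss."""
    for _ in range(steps):
        _, gx, gy = loss_and_grad(cx, cy, target)
        cx -= lr * gx
        cy -= lr * gy
    return cx, cy


if __name__ == "__main__":
    target = render(9.0, 10.0)  # stands in for the observed frame
    print(fit(target))          # estimate should approach (9.0, 10.0)
```

No position label is used anywhere in this sketch: the only training signal is the pixel difference between the rendered and the observed image, which is the property that a fully differentiable renderer makes available. In practice an automatic differentiation framework would replace the hand-written gradient.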