Paper Title
HiFECap: Monocular High-Fidelity and Expressive Capture of Human Performances
Paper Authors
Paper Abstract
Monocular 3D human performance capture is indispensable for many applications in computer graphics and vision that aim to enable immersive experiences. However, detailed capture of humans requires tracking multiple aspects, including the skeletal pose, the dynamic surface including clothing, hand gestures, and facial expressions. No existing monocular method allows joint tracking of all these components. To this end, we propose HiFECap, a new neural human performance capture approach that simultaneously captures human pose, clothing, facial expression, and hands from just a single RGB video. We demonstrate that our proposed network architecture, a carefully designed training strategy, and the tight integration of parametric face and hand models into a template mesh enable the capture of all these individual aspects. Importantly, our method also captures high-frequency details, such as deforming wrinkles on clothes, better than previous works. Furthermore, we show that HiFECap outperforms state-of-the-art human performance capture approaches both qualitatively and quantitatively, while capturing all aspects of the human for the first time.
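The abstract mentions integrating parametric face and hand models into a template mesh. The sketch below illustrates one common way such an integration can work: vertices of the template belonging to a face/hand region are replaced by the output of a parametric model and blended with the template near the seam so the surface stays continuous. All function names, the toy linear model, and the blend weights are illustrative assumptions, not HiFECap's actual implementation.

```python
import numpy as np

def parametric_part(pose_params, mean_shape, pose_basis):
    """Toy linear parametric model (stand-in for a face/hand model):
    mean shape plus pose-driven vertex offsets.
    mean_shape: (V, 3), pose_basis: (V, 3, P), pose_params: (P,)."""
    return mean_shape + np.einsum('vjp,p->vj', pose_basis, pose_params)

def blend_into_template(template_verts, model_verts, region_idx, seam_weight):
    """Overwrite the template vertices listed in `region_idx` with the
    parametric-model vertices, blending per vertex by `seam_weight`
    (1.0 = fully model-driven, values < 1 smooth the seam)."""
    out = template_verts.copy()
    w = seam_weight[:, None]  # broadcast over xyz
    out[region_idx] = w * model_verts + (1.0 - w) * template_verts[region_idx]
    return out

# Tiny example: a 5-vertex "template"; vertices 3 and 4 form the hand region.
template = np.zeros((5, 3))
mean_shape = np.ones((2, 3))          # hand model rest shape
pose_basis = np.zeros((2, 3, 2))      # zero basis -> model outputs rest shape
pose_params = np.array([0.5, -0.5])

hand = parametric_part(pose_params, mean_shape, pose_basis)
fused = blend_into_template(template, hand,
                            region_idx=np.array([3, 4]),
                            seam_weight=np.array([0.5, 1.0]))
# Vertex 3 sits on the seam (half template, half model);
# vertex 4 is fully driven by the parametric model.
```

In a full pipeline the seam weights would fall off smoothly with distance from the attachment boundary, and the blended mesh would then be deformed further by the learned surface deformation network.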