Paper Title

An Identity-Preserved Framework for Human Motion Transfer

Paper Authors

Jingzhe Ma, Xiaoqing Zhang, Shiqi Yu

Paper Abstract

Human motion transfer (HMT) aims to generate a video clip of a target subject imitating the source subject's motion. Although previous methods have achieved promising results in synthesizing high-quality videos, they overlook the individualized motion information of the source and target subjects, which is crucial to the realism of the motion in the generated video. To address this problem, we propose a novel identity-preserved HMT network, termed \textit{IDPres}. This network is a skeleton-based approach that uniquely incorporates the target's individualized motion and skeleton information to augment identity representations. This integration significantly enhances the realism of movements in the generated videos. Our method focuses on the fine-grained disentanglement and synthesis of motion. To improve representation learning in the latent space and facilitate the training of \textit{IDPres}, we introduce three training schemes. These schemes enable \textit{IDPres} to disentangle different representations concurrently and control them accurately, ensuring the synthesis of the desired motions. To evaluate the proportion of individualized motion information in the generated video, we introduce the first quantitative metric for this purpose, called Identity Score (\textit{ID-Score}), motivated by the success of gait recognition methods in capturing identity information. Moreover, we collect an identity-motion paired dataset, $Dancer101$, consisting of solo-dance videos of 101 subjects from the public domain, providing a benchmark to promote the development of HMT methods. Extensive experiments demonstrate that the proposed \textit{IDPres} method surpasses existing state-of-the-art techniques in terms of reconstruction accuracy, motion realism, and identity preservation.
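The abstract does not give the exact formulation of \textit{ID-Score}, but its stated motivation (gait recognition captures identity) suggests comparing identity embeddings of the generated video and a reference video of the target subject. Below is a minimal illustrative sketch of that idea; the function name, the hypothetical `embed_gait` encoder, and the cosine-similarity form are assumptions, not the paper's definition.

```python
import numpy as np

def id_score(gen_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    """Hypothetical ID-Score sketch: cosine similarity between the gait
    embedding of the generated video and that of a reference video of the
    target subject. Higher values suggest more preserved identity
    information. This is an illustration, not the paper's exact metric."""
    gen = gen_emb / (np.linalg.norm(gen_emb) + 1e-12)  # L2-normalize
    ref = ref_emb / (np.linalg.norm(ref_emb) + 1e-12)
    return float(gen @ ref)

# Usage sketch: `embed_gait` stands in for any pretrained gait-recognition
# encoder that maps a video to a fixed-length identity embedding
# (hypothetical helper, not provided by the paper):
# score = id_score(embed_gait(generated_frames), embed_gait(target_frames))
```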
