Paper Title
Human Motion Diffusion Model
Paper Authors
Paper Abstract
Natural and expressive human motion generation is the holy grail of computer animation. It is a challenging task, due to the diversity of possible motion, human perceptual sensitivity to it, and the difficulty of accurately describing it. Therefore, current generative solutions are either low-quality or limited in expressiveness. Diffusion models, which have already shown remarkable generative capabilities in other domains, are promising candidates for human motion due to their many-to-many nature, but they tend to be resource-hungry and hard to control. In this paper, we introduce the Motion Diffusion Model (MDM), a carefully adapted classifier-free, diffusion-based generative model for the human motion domain. MDM is transformer-based, combining insights from the motion generation literature. A notable design choice is the prediction of the sample, rather than the noise, at each diffusion step. This facilitates the use of established geometric losses on the locations and velocities of the motion, such as the foot contact loss. As we demonstrate, MDM is a generic approach, enabling different modes of conditioning and different generation tasks. We show that our model is trained with lightweight resources and yet achieves state-of-the-art results on leading benchmarks for text-to-motion and action-to-motion. Project page: https://guytevet.github.io/mdm-page/
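To make the abstract's central design choice concrete, below is a minimal, self-contained sketch (not the authors' implementation) of a diffusion training step in which the denoiser predicts the clean motion sample x0 rather than the added noise, so a geometric loss can be applied directly to its output; here a simple frame-to-frame velocity term stands in for the foot-contact and position losses mentioned above. All names (MotionDenoiser, betas, lambda_vel), the linear noise schedule, and the tensor layout (batch, frames, joints, xyz) are illustrative assumptions, and conditioning on text or action is omitted.

```python
# Sketch of sample (x0) prediction diffusion training for motion; illustrative only.
import torch
import torch.nn as nn

T = 1000                                     # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 2e-2, T)        # linear noise schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, 0)   # cumulative product \bar{alpha}_t

class MotionDenoiser(nn.Module):
    """Stand-in transformer denoiser: maps (x_t, t) -> predicted clean motion x0_hat."""
    def __init__(self, n_joints=22, feat=3, d_model=256):
        super().__init__()
        self.in_proj = nn.Linear(n_joints * feat, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.t_embed = nn.Embedding(T, d_model)
        self.out_proj = nn.Linear(d_model, n_joints * feat)

    def forward(self, x_t, t):
        B, L, J, F = x_t.shape
        h = self.in_proj(x_t.reshape(B, L, J * F)) + self.t_embed(t)[:, None, :]
        h = self.encoder(h)
        return self.out_proj(h).reshape(B, L, J, F)  # x0_hat, in motion space

def training_loss(model, x0, lambda_vel=1.0):
    """One training step: simple x0 reconstruction loss plus a geometric velocity loss."""
    B = x0.shape[0]
    t = torch.randint(0, T, (B,))
    eps = torch.randn_like(x0)
    ab = alphas_bar[t].view(B, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps   # forward diffusion q(x_t | x0)

    x0_hat = model(x_t, t)                           # predict the sample, not the noise
    loss_simple = ((x0 - x0_hat) ** 2).mean()

    # Because x0_hat lives in motion space, geometric terms apply directly:
    # here, match frame-to-frame joint velocities.
    vel, vel_hat = x0[:, 1:] - x0[:, :-1], x0_hat[:, 1:] - x0_hat[:, :-1]
    loss_vel = ((vel - vel_hat) ** 2).mean()
    return loss_simple + lambda_vel * loss_vel

# Example usage (random data): 4 motions, 60 frames, 22 joints, xyz coordinates.
model = MotionDenoiser()
loss = training_loss(model, torch.randn(4, 60, 22, 3))
loss.backward()
```

The point of the sketch is the last few lines of the loss: had the network predicted the noise instead, the geometric terms could not be computed on its raw output without first reconstructing the motion from it.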