Paper Title
Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation
Paper Authors
Paper Abstract
Pose-guided person image generation and animation aim to transform a source person image into target poses. These tasks require spatial manipulation of the source data. However, Convolutional Neural Networks are limited by their lack of ability to spatially transform inputs. In this paper, we propose a differentiable global-flow local-attention framework to reassemble the inputs at the feature level. The framework first estimates global flow fields between sources and targets; corresponding local source feature patches are then sampled with content-aware local attention coefficients. We show that our framework can spatially transform the inputs efficiently. We further model temporal consistency for the person image animation task to generate coherent videos. Experimental results on both image generation and animation tasks demonstrate the superiority of our model. In addition, results on novel view synthesis and face image animation show that our model is applicable to other tasks requiring spatial transformation. The source code of our project is available at https://github.com/RenYurui/Global-Flow-Local-Attention.
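To make the sampling step described in the abstract concrete, below is a minimal PyTorch sketch of the global-flow local-attention idea: a global flow field maps each target location to a source location, a small local source patch is sampled around that location, and the patch is blended with content-aware attention weights. This is not the authors' implementation (see the linked repository for that); the function name `local_attention_sample`, the 3x3 patch size, and the dot-product attention form are illustrative assumptions.

```python
# Minimal sketch of global-flow local-attention (illustrative assumptions,
# not the authors' code). `k` is the assumed local patch size.
import torch
import torch.nn.functional as F

def local_attention_sample(source_feat, target_feat, flow, k=3):
    """Warp source features toward the target pose.

    source_feat, target_feat: (B, C, H, W) feature maps
    flow: (B, 2, H, W) per-pixel (x, y) offsets, in pixels, mapping
          each target location to its corresponding source location
    """
    _, c, h, w = source_feat.shape
    # Absolute source coordinates predicted by the global flow field.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=flow.dtype, device=flow.device),
        torch.arange(w, dtype=flow.dtype, device=flow.device),
        indexing="ij",
    )
    coords = torch.stack((xs, ys)).unsqueeze(0) + flow   # (B, 2, H, W)

    # Sample a k x k local source patch around each flowed location.
    patches = []
    r = k // 2
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            gx = 2.0 * (coords[:, 0] + dx) / (w - 1) - 1.0  # to [-1, 1]
            gy = 2.0 * (coords[:, 1] + dy) / (h - 1) - 1.0
            grid = torch.stack((gx, gy), dim=-1)            # (B, H, W, 2)
            patches.append(
                F.grid_sample(source_feat, grid, align_corners=True)
            )
    patches = torch.stack(patches, dim=1)                   # (B, k*k, C, H, W)

    # Content-aware attention: compare each sampled source feature with
    # the target feature, softmax over the k*k patch positions.
    attn = (patches * target_feat.unsqueeze(1)).sum(dim=2)  # (B, k*k, H, W)
    attn = F.softmax(attn / c ** 0.5, dim=1)
    return (patches * attn.unsqueeze(2)).sum(dim=1)         # (B, C, H, W)
```

Calling the function with (B, C, H, W) feature maps and a (B, 2, H, W) flow field returns a warped feature map of the same shape; in the full model the flow field would come from a learned global flow estimator rather than being given directly.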