Paper Title

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

Authors

Zhenhuan Liu, Liang Li, Huajie Jiang, Xin Jin, Dandan Tu, Shuhui Wang, Zheng-Jun Zha

Abstract

In recent years, creative content generation tasks such as style transfer and neural photo editing have attracted increasing attention. Among these, cartoonization of real-world scenes has promising applications in entertainment and industry. Unlike image translation, which focuses on improving the style effect of generated images, video cartoonization additionally requires temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an unsupervised manner. The semantic alignment module is designed to restore the deformation of semantic structure caused by spatial information lost in the encoder-decoder architecture. Furthermore, we devise the spatio-temporal correlative map as a style-independent, global-aware regularization on perceptual motion consistency. Derived from similarity measurements of high-level features in photo and cartoon frames, it captures global semantic information beyond the raw pixel values used in optical flow. In addition, the similarity measurement disentangles temporal relationships from domain-specific style properties, which helps regularize temporal consistency without hurting the style effects of cartoon images. Qualitative and quantitative experiments demonstrate that our method is able to generate highly stylistic and temporally consistent cartoon videos.
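The abstract does not give the exact formulation of the spatio-temporal correlative map, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes cosine similarity as the similarity measure and a mean absolute difference as the regularizer. The function names (`cosine`, `correlative_map`, `motion_consistency_loss`) are hypothetical and are not taken from the paper. Features are represented as plain lists of per-location vectors, standing in for flattened high-level activations of some pretrained network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def correlative_map(feats_t, feats_t1):
    """Similarity of every location in frame t to every location in frame t+1.

    Each argument is a list of per-location feature vectors (H*W entries of
    length C). This is an assumed stand-in for the paper's correlative map.
    """
    return [[cosine(u, v) for v in feats_t1] for u in feats_t]


def motion_consistency_loss(photo_t, photo_t1, cartoon_t, cartoon_t1):
    """Mean absolute difference between the photo and cartoon correlative maps.

    The L1-style penalty is an assumption made for illustration.
    """
    m_photo = correlative_map(photo_t, photo_t1)
    m_cartoon = correlative_map(cartoon_t, cartoon_t1)
    diffs = [
        abs(p - c)
        for row_p, row_c in zip(m_photo, m_cartoon)
        for p, c in zip(row_p, row_c)
    ]
    return sum(diffs) / len(diffs)


# Toy example: two locations whose content swaps places between frames.
photo_t = [[1.0, 0.0], [0.0, 1.0]]
photo_t1 = [[0.0, 1.0], [1.0, 0.0]]

# Cartoon features differ in magnitude (a crude proxy for a style change)
# but follow the same motion, so the correlative maps agree.
cartoon_t = [[2.0, 0.0], [0.0, 3.0]]
cartoon_t1 = [[0.0, 3.0], [2.0, 0.0]]
consistent = motion_consistency_loss(photo_t, photo_t1, cartoon_t, cartoon_t1)

# A cartoon sequence that ignores the motion (frame t+1 equals frame t).
inconsistent = motion_consistency_loss(photo_t, photo_t1, cartoon_t, cartoon_t)
```

In this toy setup the penalty compares how locations relate across frames instead of comparing pixel values directly, which is one way to read the "style-independent" property described in the abstract: rescaling the cartoon features leaves the map unchanged, while a cartoon sequence that does not follow the photo's motion is penalized.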
