Diffusion Models already have a Semantic Latent Space

Paper title

Diffusion Models already have a Semantic Latent Space

Authors

Mingi Kwon, Jaeseok Jeong, Youngjung Uh

Abstract

Diffusion models achieve outstanding generative performance in various domains. Despite their great success, they lack a semantic latent space, which is essential for controlling the generative process. To address the problem, we propose an asymmetric reverse process (Asyrp), which discovers the semantic latent space in frozen pretrained diffusion models. Our semantic latent space, named h-space, has nice properties for accommodating semantic image manipulation: homogeneity, linearity, robustness, and consistency across timesteps. In addition, we introduce a principled design of the generative process for versatile editing and quality boosting by quantifiable measures: editing strength of an interval and quality deficiency at a timestep. Our method is applicable to various architectures (DDPM++, iDDPM, and ADM) and datasets (CelebA-HQ, AFHQ-dog, LSUN-church, LSUN-bedroom, and MetFaces). Project page: https://kwonminki.github.io/Asyrp/
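
The abstract names an asymmetric reverse process but does not spell out its update rule. The sketch below is one possible reading, assuming a deterministic DDIM step (eta = 0) that splits into a predicted-x0 term and a direction term, with the edited noise estimate used only in the predicted-x0 term. The names `ddim_step` and `toy_eps`, the toy noise predictor, and the numeric values are illustrative assumptions and are not taken from the authors' code.

```python
import numpy as np


def ddim_step(x_t, eps_p, eps_d, alpha_t, alpha_prev):
    """One deterministic DDIM step (eta = 0) with two noise estimates.

    eps_p feeds the predicted-x0 term, eps_d feeds the direction term.
    Passing the same array for both gives the ordinary DDIM step.
    alpha_t and alpha_prev are cumulative alpha products in (0, 1).
    """
    pred_x0 = (x_t - np.sqrt(1.0 - alpha_t) * eps_p) / np.sqrt(alpha_t)
    direction = np.sqrt(1.0 - alpha_prev) * eps_d
    return np.sqrt(alpha_prev) * pred_x0 + direction


def toy_eps(x_t, delta_h=None):
    """Hypothetical stand-in for a frozen noise predictor.

    h plays the role of an internal bottleneck feature; delta_h is a
    shift added to it, mimicking an edit in an h-space-like latent.
    """
    h = 0.5 * x_t
    if delta_h is not None:
        h = h + delta_h
    return np.tanh(h)


# Toy usage: compare the plain step with the asymmetric one.
rng = np.random.default_rng(0)
x_t = rng.standard_normal(4)
alpha_t, alpha_prev = 0.5, 0.7

eps = toy_eps(x_t)                                # original estimate
eps_edit = toy_eps(x_t, delta_h=np.full(4, 0.3))  # shifted estimate

x_plain = ddim_step(x_t, eps, eps, alpha_t, alpha_prev)
# Asymmetric: edited estimate in the predicted-x0 term only.
x_asym = ddim_step(x_t, eps_edit, eps, alpha_t, alpha_prev)
```

In this sketch the model weights are never touched; the only change is which noise estimate enters each term, which matches the abstract's statement that the method works on frozen pretrained models.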
