Paper Title


VecGAN: Image-to-Image Translation with Interpretable Latent Directions

Paper Authors

Yusuf Dalva, Said Fahri Altindis, Aysegul Dundar

Abstract


We propose VecGAN, an image-to-image translation framework for facial attribute editing with interpretable latent directions. The facial attribute editing task faces the challenges of precise attribute editing with controllable strength while preserving the other attributes of an image. For this goal, we design the attribute editing by latent space factorization, and for each attribute, we learn a linear direction that is orthogonal to the others. The other component is the controllable strength of the change, a scalar value. In our framework, this scalar can be either sampled or encoded from a reference image by projection. Our work is inspired by latent space factorization works on fixed pretrained GANs. However, while those models cannot be trained end-to-end and struggle to edit encoded images precisely, VecGAN is trained end-to-end for the image translation task and successfully edits an attribute while preserving the others. Our extensive experiments show that VecGAN achieves significant improvements over the state of the art for both local and global edits.
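The core editing mechanism described above (one orthogonal linear direction per attribute, with a scalar strength obtained by projection) can be illustrated with a minimal sketch. This is not the authors' implementation: the latent dimension, the number of attributes, and the use of a QR decomposition to obtain orthonormal directions are all illustrative assumptions standing in for learned quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: latent dimension and number of editable attributes.
LATENT_DIM, NUM_ATTRS = 64, 5

# Stand-in for the learned direction matrix. In VecGAN these directions
# are learned end-to-end; here we just orthonormalize a random matrix
# via QR so the columns satisfy the orthogonality constraint.
A = rng.standard_normal((LATENT_DIM, NUM_ATTRS))
directions, _ = np.linalg.qr(A)  # columns are orthonormal

def project_strength(e: np.ndarray, i: int) -> float:
    """Encode the current strength of attribute i by projecting the
    latent code onto that attribute's direction."""
    return float(e @ directions[:, i])

def edit(e: np.ndarray, i: int, target_alpha: float) -> np.ndarray:
    """Shift the latent code along direction i so that its projected
    strength becomes target_alpha. Because the directions are mutually
    orthogonal, the strengths of the other attributes are unchanged."""
    current = project_strength(e, i)
    return e + (target_alpha - current) * directions[:, i]

e = rng.standard_normal(LATENT_DIM)  # stand-in for an encoded image
e_edited = edit(e, 2, 1.5)           # set attribute 2 to strength 1.5
```

In the reference-guided setting described in the abstract, `target_alpha` would instead be computed as `project_strength(e_ref, i)` for an encoded reference image, transferring that attribute's strength.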
