Fantastic Style Channels and Where to Find Them: A Submodular Framework for Discovering Diverse Directions in GANs

Paper title

Fantastic Style Channels and Where to Find Them: A Submodular Framework for Discovering Diverse Directions in GANs

Authors

Enis Simsar, Umut Kocasari, Ezgi Gülperi Er, Pinar Yanardag

Abstract

The discovery of interpretable directions in the latent spaces of pre-trained GAN models has recently become a popular topic. In particular, StyleGAN2 has enabled various image generation and manipulation tasks due to its rich and disentangled latent spaces. The discovery of such directions is typically done either in a supervised manner, which requires annotated data for each desired manipulation or in an unsupervised manner, which requires a manual effort to identify the directions. As a result, existing work typically finds only a handful of directions in which controllable edits can be made. In this study, we design a novel submodular framework that finds the most representative and diverse subset of directions in the latent space of StyleGAN2. Our approach takes advantage of the latent space of channel-wise style parameters, so-called style space, in which we cluster channels that perform similar manipulations into groups. Our framework promotes diversity by using the notion of clusters and can be efficiently solved with a greedy optimization scheme. We evaluate our framework with qualitative and quantitative experiments and show that our method finds more diverse and disentangled directions. Our project page can be found at http://catlab-team.github.io/fantasticstyles.
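The abstract names the ingredients (a submodular objective, clusters of channels, greedy optimization) but does not state the objective itself. The following is a minimal sketch of how such a selection could look, not the authors' formulation. It assumes a facility-location coverage term for representativeness and a square-root-of-cluster-count term for diversity. The names `objective`, `greedy_select`, `sim`, `clusters`, and `lam` are hypothetical, and the toy similarity values are made up for illustration.

```python
import math

def objective(selected, sim, clusters, lam=1.0):
    """Assumed objective: facility-location coverage plus a cluster diversity reward.

    sim[i][j] is a non-negative similarity between channels i and j,
    clusters[j] is the cluster id of channel j. Both terms are monotone
    submodular for non-negative similarities, so their weighted sum is too.
    """
    if not selected:
        return 0.0
    n = len(sim)
    # Representativeness: every channel is covered by its most similar selected channel.
    coverage = sum(max(sim[i][j] for j in selected) for i in range(n))
    # Diversity: sqrt of the per-cluster count gives diminishing returns
    # for picking repeatedly from the same cluster.
    counts = {}
    for j in selected:
        counts[clusters[j]] = counts.get(clusters[j], 0) + 1
    diversity = sum(math.sqrt(c) for c in counts.values())
    return coverage + lam * diversity

def greedy_select(sim, clusters, k, lam=1.0):
    """Greedily add the channel with the largest marginal gain, k times."""
    selected = []
    remaining = set(range(len(sim)))
    for _ in range(min(k, len(sim))):
        base = objective(selected, sim, clusters, lam)
        best = max(
            sorted(remaining),
            key=lambda j: objective(selected + [j], sim, clusters, lam) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (illustrative only): channels 0,1 form one cluster, 2,3 another.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
clusters = [0, 0, 1, 1]
picked = greedy_select(sim, clusters, k=2)
print(picked)
```

On this toy input the second pick comes from the cluster not yet represented, because a second channel from an already covered cluster adds little coverage and only a diminished diversity reward. For monotone submodular objectives, greedy selection under a cardinality constraint carries the well-known (1 - 1/e) approximation guarantee, which is why a simple loop like this can be an efficient solver.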
