Paper title
GAN "Steerability" without optimization
Paper authors
Abstract
Recent research has shown remarkable success in revealing "steering" directions in the latent spaces of pre-trained GANs. These directions correspond to semantically meaningful image transformations (e.g., shift, zoom, color manipulations), and have similar interpretable effects across all categories that the GAN can generate. Some methods focus on user-specified transformations, while others discover transformations in an unsupervised manner. However, all existing techniques rely on an optimization procedure to expose those directions, and offer no control over the degree of allowed interaction between different transformations. In this paper, we show that "steering" trajectories can be computed in closed form directly from the generator's weights, without any form of training or optimization. This applies to user-prescribed geometric transformations, as well as to unsupervised discovery of more complex effects. Our approach allows determining both linear and nonlinear trajectories, and has many advantages over previous methods. In particular, we can control whether one transformation is allowed to come at the expense of another (e.g., zoom-in with or without allowing translation to keep the object centered). Moreover, we can determine the natural end-point of the trajectory, which corresponds to the largest extent to which a transformation can be applied without incurring degradation. Finally, we show how transferring attributes between images can be achieved without optimization, even across different categories.
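To make the phrase "computed in closed form directly from the generator's weights" more concrete, the following is a minimal sketch of one way such a computation could look. It is not taken from the abstract: it assumes that the generator's first layer is an affine map `W @ z + b`, and that the desired transformation can be written as a matrix `P` acting on that layer's output. Under those assumptions, the latent offset `q` that best reproduces the transformed features is an ordinary least-squares solution, so no iterative training is involved. The names `steering_offset`, `W`, `b`, `P`, `z`, and `q` are hypothetical, and the weights here are random toy values rather than those of a real GAN.

```python
import numpy as np


def steering_offset(W, b, P, z):
    """Least-squares latent offset q such that W @ (z + q) + b ~= P @ (W @ z + b).

    Assumption (not stated in the abstract): the first generator layer is the
    affine map W @ z + b, and P is a matrix applying the desired transformation
    to that layer's output.
    """
    features = W @ z + b                # first-layer features of the current latent
    residual = P @ features - features  # feature change that W @ q should produce
    # Closed-form solve via linear least squares; no gradient-based optimization.
    q, *_ = np.linalg.lstsq(W, residual, rcond=None)
    return q


# Toy setup with random weights (illustrative only).
rng = np.random.default_rng(0)
feat_dim, latent_dim = 6, 4
W = rng.standard_normal((feat_dim, latent_dim))
b = rng.standard_normal(feat_dim)
P = np.roll(np.eye(feat_dim), 1, axis=0)  # cyclic shift of the features by one position
z = rng.standard_normal(latent_dim)

q = steering_offset(W, b, P, z)
z_steered = z + q  # one step along the steering trajectory
```

Because `q` depends on the current `z` in this sketch, recomputing it after every step would trace a nonlinear trajectory, whereas reusing a single fixed offset would give a linear one. This is only an illustration of the distinction the abstract draws, under the assumptions stated above.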