Rethinking the Paradigm of Content Constraints in Unpaired Image-to-Image Translation

Paper Title

Rethinking the Paradigm of Content Constraints in Unpaired Image-to-Image Translation

Authors

Xiuding Cai, Yaoyao Zhu, Dong Miao, Linjie Fu, Yu Yao

Abstract

In the unpaired setting, image-to-image translation (I2I) tasks lack sufficient content constraints, and GAN-based approaches are therefore prone to model collapse. Current solutions fall into two categories: reconstruction-based and Siamese network-based. The former requires that the transformed or transforming image can be perfectly converted back to the original image, which is sometimes too strict and limits generative performance. The latter feeds the original and generated images into a feature extractor and then matches their outputs. This is not efficient enough, and a universal feature extractor is not easily available. In this paper, we propose EnCo, a simple but efficient way to maintain content by constraining the representational similarity, in the latent space, of patch-level features from the same stage of the Encoder and deCoder of the generator. For the similarity function, we use a simple MSE loss instead of the contrastive loss that is currently widely used in I2I tasks. Benefiting from this design, EnCo training is extremely efficient, while the features from the encoder have a more positive effect on decoding, leading to more satisfying generations. In addition, we rethink the role played by the discriminator in sampling patches and propose a discriminative attention-guided (DAG) patch sampling strategy to replace random sampling. DAG is parameter-free and requires only negligible computational overhead, while significantly improving model performance. Extensive experiments on multiple datasets demonstrate the effectiveness and advantages of EnCo, and we achieve multiple state-of-the-art results compared with previous methods. Our code is available at https://github.com/XiudingCai/EnCo-pytorch.
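As an illustration only, the content constraint described in the abstract (an MSE loss between patch-level features taken from the same stage of the encoder and the decoder) might be sketched as below. This is not the authors' implementation. The function names `patch_mse` and `enco_style_loss`, the representation of a feature map as a flat list of per-position vectors, and the way stages are paired are all assumptions made for this sketch. Any projection heads or normalization the paper may use are omitted, and patches are sampled at random, which is the baseline that DAG is proposed to replace.

```python
import random


def patch_mse(enc_feats, dec_feats, num_patches, rng):
    """MSE between encoder and decoder features at randomly sampled positions.

    enc_feats, dec_feats: lists of equal length; entry i is the feature
    vector (a list of floats) at spatial position i of one stage.
    (Hypothetical data layout, chosen to keep the sketch dependency-free.)
    """
    if len(enc_feats) != len(dec_feats):
        raise ValueError("paired stages must have the same number of positions")
    # Random patch sampling; the abstract's DAG strategy is NOT implemented here.
    k = min(num_patches, len(enc_feats))
    positions = rng.sample(range(len(enc_feats)), k)
    total, count = 0.0, 0
    for p in positions:
        for e, d in zip(enc_feats[p], dec_feats[p]):
            total += (e - d) ** 2
            count += 1
    return total / count


def enco_style_loss(enc_stages, dec_stages, num_patches=4, seed=0):
    """Average the patch-level MSE over paired encoder/decoder stages."""
    rng = random.Random(seed)
    losses = [
        patch_mse(e, d, num_patches, rng)
        for e, d in zip(enc_stages, dec_stages)
    ]
    return sum(losses) / len(losses)


# One stage with three spatial positions and two channels (toy values).
enc = [[[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]]
dec = [[[0.5, 1.0], [2.0, 2.0], [4.0, 5.0]]]
loss = enco_style_loss(enc, dec, num_patches=3)
```

In a real training loop the features would be tensors from a deep learning framework and the loss would be added to the adversarial objective. The plain-list version here only shows the shape of the computation: sample positions, compare the paired features, average the squared differences.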
