Paper Title

Semi-Parametric Neural Image Synthesis

Paper Authors

Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, Björn Ommer

Paper Abstract

Novel architectures have recently improved generative image synthesis, leading to excellent visual quality in various tasks. Much of this success is due to the scalability of these architectures, and hence caused by a dramatic increase in model complexity and in the computational resources invested in training these models. Our work questions the underlying paradigm of compressing large training data into ever-growing parametric representations. We rather present an orthogonal, semi-parametric approach. We complement comparably small diffusion or autoregressive models with a separate image database and a retrieval strategy. During training, we retrieve a set of nearest neighbors from this external database for each training instance and condition the generative model on these informative samples. While the retrieval approach provides the (local) content, the model focuses on learning the composition of scenes based on this content. As demonstrated by our experiments, simply swapping the database for one with different contents transfers a trained model post-hoc to a novel domain. The evaluation shows competitive performance on tasks which the generative model has not been trained on, such as class-conditional synthesis, zero-shot stylization, or text-to-image synthesis without requiring paired text-image data. With negligible memory and computational overhead for the external database and retrieval, we can significantly reduce the parameter count of the generative model and still outperform the state-of-the-art.
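The abstract describes the method only at a high level: for each training image, retrieve its nearest neighbors from an external image database and condition a comparably small generative model on these samples. The toy PyTorch sketch below illustrates that training loop under stated assumptions; the retrieval encoder, the random embedding database, and the toy conditional denoiser are all hypothetical stand-ins for the paper's components (e.g. a frozen CLIP encoder and a diffusion or autoregressive backbone), not the authors' implementation.

```python
# Minimal sketch of retrieval-augmented (semi-parametric) training.
# Assumptions (not from the paper): `encode` stands in for a frozen retrieval
# encoder, `database` holds precomputed embeddings of an external image set,
# and `ConditionalDenoiser` stands in for the conditional generative backbone.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

EMB_DIM, DB_SIZE, K = 64, 1000, 4   # embedding size, database size, neighbors per sample


def encode(images: torch.Tensor) -> torch.Tensor:
    """Stand-in for a frozen retrieval encoder: flatten, pool to EMB_DIM, L2-normalize."""
    flat = images.flatten(1)
    proj = F.adaptive_avg_pool1d(flat.unsqueeze(1), EMB_DIM).squeeze(1)
    return F.normalize(proj, dim=-1)


# Non-parametric part: an external database of normalized image embeddings (random here).
database = F.normalize(torch.randn(DB_SIZE, EMB_DIM), dim=-1)


def retrieve_neighbors(query_emb: torch.Tensor, k: int = K) -> torch.Tensor:
    """Cosine-similarity k-NN lookup against the external database."""
    sims = query_emb @ database.T            # (batch, DB_SIZE)
    _, idx = sims.topk(k, dim=-1)            # indices of the k nearest neighbors
    return database[idx]                     # (batch, k, EMB_DIM)


class ConditionalDenoiser(torch.nn.Module):
    """Toy conditional model: predicts noise from a noisy image plus neighbor embeddings."""

    def __init__(self, img_dim: int):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(img_dim + K * EMB_DIM, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, img_dim),
        )

    def forward(self, noisy: torch.Tensor, neighbors: torch.Tensor) -> torch.Tensor:
        cond = neighbors.flatten(1)          # concatenate the k neighbor embeddings
        return self.net(torch.cat([noisy.flatten(1), cond], dim=-1))


# One illustrative training step: retrieve neighbors for each training image,
# corrupt the image with noise, and train the model to predict that noise
# conditioned on the retrieved content.
images = torch.randn(8, 3, 16, 16)                  # dummy training batch
neighbors = retrieve_neighbors(encode(images))       # (8, K, EMB_DIM)
noise = torch.randn_like(images)
noisy = images + noise                               # simplified "diffusion" corruption

model = ConditionalDenoiser(img_dim=3 * 16 * 16)
loss = F.mse_loss(model(noisy, neighbors), noise.flatten(1))
loss.backward()
print(f"toy denoising loss: {loss.item():.4f}")
```

At sampling time, the same retrieval step can be driven by a different database (or, in the paper's zero-shot settings, by text embeddings in a shared embedding space), which is what allows the trained model to be transferred post-hoc by swapping the database.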
