Elucidating the Design Space of Diffusion-Based Generative Models

Paper Title

Elucidating the Design Space of Diffusion-Based Generative Models

Authors

Tero Karras, Miika Aittala, Timo Aila, Samuli Laine

Abstract

We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. This lets us identify several changes to both the sampling and training processes, as well as preconditioning of the score networks. Together, our improvements yield new state-of-the-art FID of 1.79 for CIFAR-10 in a class-conditional setting and 1.97 in an unconditional setting, with much faster sampling (35 network evaluations per image) than prior designs. To further demonstrate their modular nature, we show that our design changes dramatically improve both the efficiency and quality obtainable with pre-trained score networks from previous work, including improving the FID of a previously trained ImageNet-64 model from 2.07 to near-SOTA 1.55, and after re-training with our proposed improvements to a new SOTA of 1.36.
