Paper Title

Efficient Diffusion Models for Vision: A Survey

Authors

Anwaar Ulhaq, Naveed Akhtar

Abstract

Diffusion Models (DMs) have demonstrated state-of-the-art performance in content generation without requiring adversarial training. These models are trained using a two-step process. First, a forward (diffusion) process gradually adds noise to a datum, usually an image. Then, a backward (reverse diffusion) process gradually removes the noise to turn it into a sample of the target distribution being modelled. DMs are inspired by non-equilibrium thermodynamics and have inherently high computational complexity. Because they require frequent function evaluations and gradient calculations in high-dimensional spaces, these models incur considerable computational overhead during both training and inference. This not only can preclude the democratization of diffusion-based modelling, but also hinders the adaptation of diffusion models to real-life applications. Moreover, the efficiency of computational models is fast becoming a significant concern because of excessive energy consumption and environmental concerns. These factors have led to multiple contributions in the literature that focus on devising computationally efficient DMs. In this review, we present the most recent advances in diffusion models for vision, focusing on the design aspects that affect the computational efficiency of DMs. In particular, we emphasize recently proposed design choices that have led to more efficient DMs. Unlike other recent reviews, which discuss diffusion models from a broad perspective, this survey aims to push this research direction forward by highlighting the design strategies in the literature that yield practicable models for the broader research community. We also provide a future outlook on diffusion models in vision from the viewpoint of computational efficiency.
