Paper Title
Memory Efficient Patch-based Training for INR-based GANs
Paper Authors
Paper Abstract
Recent studies have shown remarkable progress in GANs based on implicit neural representation (INR), an MLP that produces an RGB value given an (x, y) coordinate. They represent an image as a continuous version of the underlying 2D signal instead of a 2D array of pixels, which opens new horizons for GAN applications (e.g., zero-shot super-resolution, image outpainting). However, training existing approaches requires a heavy computational cost proportional to the image resolution, since they compute an MLP operation for every (x, y) coordinate. To alleviate this issue, we propose multi-stage patch-based training, a novel and scalable approach that can train INR-based GANs at a flexible computational cost regardless of the image resolution. Specifically, our method generates and discriminates images patch by patch to learn local details, and learns global structural information through a novel reconstruction loss, enabling efficient GAN training. We conduct experiments on several benchmark datasets to demonstrate that our approach improves the GPU memory efficiency of baseline models while maintaining FID at a reasonable level.
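To make the cost argument in the abstract concrete, the following PyTorch sketch shows an INR generator as an MLP over (x, y) coordinates and how sampling only a patch of coordinates makes the per-step cost scale with the patch size rather than the image resolution. This is a minimal illustration, not the paper's actual architecture; the names `INRGenerator`, `full_grid`, and `random_patch` are hypothetical.

```python
# Minimal sketch (assumed, not the paper's method): an INR maps each
# (x, y) coordinate to an RGB value, so rendering a full image needs
# O(resolution^2) MLP calls, while a patch needs only O(patch^2).
import torch
import torch.nn as nn


class INRGenerator(nn.Module):
    """Toy INR: an MLP mapping an (x, y) coordinate to an RGB value."""

    def __init__(self, hidden_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 3), nn.Tanh(),  # RGB in [-1, 1]
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        # coords: (N, 2) in [-1, 1]^2 -> (N, 3) RGB values
        return self.mlp(coords)


def full_grid(resolution: int) -> torch.Tensor:
    """All coordinates of a resolution x resolution image:
    resolution^2 MLP evaluations per generated image."""
    axis = torch.linspace(-1.0, 1.0, resolution)
    ys, xs = torch.meshgrid(axis, axis, indexing="ij")
    return torch.stack([xs, ys], dim=-1).reshape(-1, 2)


def random_patch(resolution: int, patch: int) -> torch.Tensor:
    """Coordinates of one random patch x patch crop: patch^2 MLP
    evaluations, independent of the full image resolution."""
    top = torch.randint(0, resolution - patch + 1, (1,)).item()
    left = torch.randint(0, resolution - patch + 1, (1,)).item()
    axis = torch.linspace(-1.0, 1.0, resolution)
    ys, xs = torch.meshgrid(
        axis[top:top + patch], axis[left:left + patch], indexing="ij"
    )
    return torch.stack([xs, ys], dim=-1).reshape(-1, 2)


g = INRGenerator()
patch_rgb = g(random_patch(resolution=1024, patch=64))  # 64*64 = 4096 calls
print(patch_rgb.shape)  # torch.Size([4096, 3]), cost independent of 1024^2
```

In a patch-based GAN training step of this kind, the discriminator would see such generated patches alongside real-image crops, so the memory footprint is set by the patch size rather than the target resolution.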