Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting

Paper Title

Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting

Authors

Zili Yi, Qiang Tang, Shekoofeh Azizi, Daesik Jang, Zhan Xu

Abstract

Recently, data-driven image inpainting methods have made inspiring progress, impacting fundamental image editing tasks such as object removal and damaged image repair. These methods are more effective than classic approaches; however, due to memory limitations, they can only handle low-resolution inputs, typically smaller than 1K. Meanwhile, the resolution of photos captured with mobile devices has increased up to 8K. Naive up-sampling of the low-resolution inpainted result merely yields a large yet blurry image, whereas adding a high-frequency residual image onto the large blurry image can generate a sharp result, rich in details and textures. Motivated by this, we propose a Contextual Residual Aggregation (CRA) mechanism that produces high-frequency residuals for missing contents by weighted aggregation of residuals from contextual patches, thus requiring only a low-resolution prediction from the network. Since the convolutional layers of the neural network only need to operate on low-resolution inputs and outputs, the cost of memory and computing power is well suppressed. Moreover, the need for high-resolution training datasets is alleviated. In our experiments, we train the proposed model on small images with a resolution of 512x512 and perform inference on high-resolution images, achieving compelling inpainting quality. Our model can inpaint images as large as 8K with considerable hole sizes, which is intractable with previous learning-based approaches. We further elaborate on the light-weight design of the network architecture, achieving real-time performance on 2K images on a GTX 1080 Ti GPU. Code is available at: Atlas200dk/sample-imageinpainting-HiFill.
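
The residual aggregation idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`downsample`, `upsample`, `to_patches`, `from_patches`, `cra_sketch`), the non-overlapping patch layout, the block-average and nearest-neighbour resampling, and the convention that the attention matrix is supplied by the caller are all assumptions made for illustration. The abstract does not say how the attention weights are computed, so here they are simply an input, as is the low-resolution inpainted prediction that would come from the network.

```python
import numpy as np

def downsample(img, s):
    """Block-average a (H, W) image by an integer factor s."""
    h, w = img.shape
    return img.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(img, s):
    """Nearest-neighbour up-sampling by an integer factor s."""
    return np.repeat(np.repeat(img, s, axis=0), s, axis=1)

def to_patches(img, p):
    """Split a (H, W) image into non-overlapping p x p patches: (N, p, p)."""
    h, w = img.shape
    return img.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p, p)

def from_patches(patches, h, w, p):
    """Inverse of to_patches: reassemble (N, p, p) patches into (H, W)."""
    return patches.reshape(h // p, w // p, p, p).swapaxes(1, 2).reshape(h, w)

def cra_sketch(hr, hole, lr_inpainted, attention, s, p):
    """Hypothetical sketch of contextual residual aggregation.

    hr           : (H, W) high-resolution image; hole pixels must be finite.
    hole         : (N,) bool, True for patches that are missing.
    lr_inpainted : (H//s, W//s) low-resolution prediction (assumed given).
    attention    : (N, N); row i holds weights over all patches for patch i.
                   Rows of hole patches are assumed to sum to 1 and to put
                   zero weight on other hole patches.
    s, p         : down-sampling factor and patch size (p a multiple of s).
    """
    h, w = hr.shape
    # High-frequency residual: what is lost by down- then up-sampling.
    residual = hr - upsample(downsample(hr, s), s)
    res_patches = to_patches(residual, p)
    # Weighted sum of contextual residual patches for every patch.
    aggregated = np.tensordot(attention, res_patches, axes=(1, 0))
    # Only hole patches take the aggregated residual.
    res_patches = np.where(hole[:, None, None], aggregated, res_patches)
    # Large but blurry result from the low-resolution prediction.
    blurry_patches = to_patches(upsample(lr_inpainted, s), p)
    sharp_patches = blurry_patches + res_patches
    # Known regions keep their original pixels.
    out = np.where(hole[:, None, None], sharp_patches, to_patches(hr, p))
    return from_patches(out, h, w, p)
```

In this sketch the only learned quantities would be `lr_inpainted` and `attention`; every high-resolution operation is a reshape, a subtraction, or a weighted sum, which matches the abstract's point that the network itself only needs to work at low resolution.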
