ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors

Paper Title

ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors

Authors

Chenjie Cao, Qiaole Dong, Yanwei Fu

Abstract

Image inpainting involves filling in the missing areas of a corrupted image. Although impressive results have been achieved recently, restoring images with both vivid textures and reasonable structures remains a significant challenge. Previous methods have primarily addressed regular textures while disregarding holistic structures because of the limited receptive fields of Convolutional Neural Networks (CNNs). To this end, we study learning a Zero-initialized residual addition based Incremental Transformer on Structural priors (ZITS++), an improved model built upon our conference work, ZITS. Specifically, given a corrupted image, we present the Transformer Structure Restorer (TSR) module to restore holistic structural priors at low image resolution, which are then upsampled to higher image resolution by the Simple Structure Upsampler (SSU) module. To recover image texture details, we use the Fourier CNN Texture Restoration (FTR) module, which is strengthened by Fourier and large-kernel attention convolutions. Furthermore, to enhance the FTR, the upsampled structural priors from TSR are further processed by the Structure Feature Encoder (SFE) and optimized incrementally with the Zero-initialized Residual Addition (ZeroRA). In addition, a new masking positional encoding is proposed to encode large irregular masks. Compared with ZITS, ZITS++ improves the FTR's stability and inpainting ability with several techniques. More importantly, we comprehensively explore the effects of various image priors on inpainting and, through extensive experiments, investigate how to utilize them to address high-resolution image inpainting. This investigation is orthogonal to most inpainting approaches and can thus significantly benefit the community. Code and models will be released at https://github.com/ewrfcas/ZITS-PlusPlus.
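To make the idea of zero-initialized residual addition a little more concrete, here is a minimal sketch. The abstract does not give the formula, so the form used here (output = features + alpha * prior, with a scalar alpha that starts at 0) is an assumption, and the function name `zero_init_residual_add` and the toy values are hypothetical, not taken from the released code.

```python
def zero_init_residual_add(features, prior_features, alpha=0.0):
    """Hypothetical helper: add prior features scaled by a scalar alpha.

    Assumption: alpha is a learnable scalar that starts at 0, so the
    prior branch contributes nothing at initialization.
    """
    if len(features) != len(prior_features):
        raise ValueError("feature vectors must have the same length")
    return [f + alpha * p for f, p in zip(features, prior_features)]


# Toy values standing in for texture features and structural-prior features.
texture = [0.5, -1.0, 2.0]
structure = [10.0, 20.0, 30.0]

# At initialization (alpha = 0) the texture features pass through unchanged.
at_init = zero_init_residual_add(texture, structure)

# Once alpha has moved away from 0, the priors are blended in.
after_training = zero_init_residual_add(texture, structure, alpha=0.5)
```

Under this assumption, a pretrained texture branch is left undisturbed when the structural branch is first attached, and the influence of the priors grows only as alpha is learned, which is one plausible reading of "incrementally" in the abstract.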
