Paper Title
MISF: Multi-level Interactive Siamese Filtering for High-Fidelity Image Inpainting
Paper Authors
Paper Abstract
Despite significant progress, existing deep generative inpainting methods are far from real-world applications due to their low generalization across different scenes. As a result, the generated images usually contain artifacts, or the filled pixels differ greatly from the ground truth. Image-level predictive filtering is a widely used image restoration technique that adaptively predicts suitable kernels according to different input scenes. Inspired by this inherent advantage, we explore the possibility of addressing image inpainting as a filtering task. To this end, we first study the advantages and challenges of image-level predictive filtering for image inpainting: the method can preserve local structures and avoid artifacts but fails to fill large missing areas. Then, we propose semantic filtering by conducting filtering at the deep feature level, which fills in the missing semantic information but fails to recover the details. To address these issues while retaining the respective advantages, we propose a novel filtering technique, i.e., Multi-level Interactive Siamese Filtering (MISF), which contains two branches: a kernel prediction branch (KPB) and a semantic & image filtering branch (SIFB). These two branches are interactively linked: SIFB provides multi-level features for KPB, while KPB predicts dynamic kernels for SIFB. As a result, the final method takes advantage of effective semantic- and image-level filling for high-fidelity inpainting. We validate our method on three challenging datasets, i.e., Dunhuang, Places2, and CelebA. Our method outperforms state-of-the-art baselines on four metrics, i.e., L1, PSNR, SSIM, and LPIPS. Please try the released code and model at https://github.com/tsingqguo/misf.
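The core operation the abstract describes is predictive filtering: instead of one shared convolution kernel, a kernel is predicted *per pixel* and applied to that pixel's neighborhood. The sketch below illustrates only the filtering step on a single-channel image; the kernel-prediction network (KPB) and the feature-level branch are omitted, and all names here are illustrative, not the authors' implementation.

```python
# Minimal sketch of per-pixel (spatially-variant) predictive filtering.
# `kernels` plays the role of the dynamic kernels that MISF's KPB would
# predict; here they are supplied directly as an input.

def predictive_filter(image, kernels, ksize=3):
    """image: H x W list of floats.
    kernels: H x W list of ksize*ksize weight lists, one kernel per pixel.
    Returns the filtered H x W image."""
    h, w = len(image), len(image[0])
    r = ksize // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # Replicate-pad at the borders.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    weight = kernels[y][x][(dy + r) * ksize + (dx + r)]
                    acc += weight * image[yy][xx]
            out[y][x] = acc
    return out
```

With an identity kernel (weight 1 at the center, 0 elsewhere) at every pixel, the output reproduces the input; a learned predictor instead emits kernels that pull in valid neighboring pixels to fill missing regions.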