ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

Paper title

ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

Authors

Shen, Sheng; Yue, Huanjing; Yang, Jingyu; Li, Kun

Abstract

Online screen sharing and remote collaboration are now ubiquitous. However, screen content may be downsampled and compressed during transmission, while at the receiver side it may be displayed on large screens or zoomed in by users for detailed inspection. A strong and effective screen content image (SCI) super-resolution (SR) method is therefore needed. We observe that weight-sharing upsamplers (such as deconvolution or pixel shuffle) can harm the sharp, thin edges in SCIs, and that fixed-scale upsamplers are too inflexible to fit screens of various sizes. To solve this problem, we propose an implicit transformer network for continuous SCI SR, termed ITSRN++. Specifically, we propose a modulation-based transformer as the upsampler, which modulates pixel features in discrete space via a periodic nonlinear function to generate features for continuous pixels. To enhance the extracted features, we further propose an enhanced transformer as the feature extraction backbone, in which convolution and attention branches are used in parallel. In addition, we construct a large-scale SCI2K dataset to facilitate research on SCI SR. Experimental results on nine datasets demonstrate that the proposed method achieves state-of-the-art performance for SCI SR (outperforming SwinIR by 0.74 dB for x3 SR) and also works well for natural image SR. Our code and dataset will be released upon acceptance of this work.
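The idea of generating features for continuous pixel positions from a discrete feature map can be illustrated with a toy sketch. This is not the ITSRN++ implementation: the function names, the nearest-neighbour lookup, the sine modulation of coordinate offsets, and the random (untrained) weights are all assumptions chosen only to show how an upsampler can serve an arbitrary, non-integer output size.

```python
import numpy as np

def make_coords(n_out, n_in):
    # Centers of n_out output pixels, expressed in input-pixel units.
    return (np.arange(n_out) + 0.5) * n_in / n_out - 0.5

def continuous_upsample(feat, out_h, out_w, rng):
    """Toy continuous upsampler (hypothetical, untrained weights)."""
    h, w, c = feat.shape
    ys, xs = make_coords(out_h, h), make_coords(out_w, w)

    # Nearest discrete pixel for every continuous query position.
    iy = np.clip(np.rint(ys).astype(int), 0, h - 1)
    ix = np.clip(np.rint(xs).astype(int), 0, w - 1)

    # Offset of each query from its nearest discrete pixel.
    offsets = np.stack(np.meshgrid(ys - iy, xs - ix, indexing="ij"), axis=-1)

    # Periodic nonlinearity of the offsets (assumed form: sine of a linear map).
    freq = rng.standard_normal((2, c))
    phase = rng.standard_normal(c)
    modulation = np.sin(offsets @ freq + phase)      # (out_h, out_w, c)

    # Modulate the discrete features, then project to 3 color channels.
    nearest = feat[iy[:, None], ix[None, :]]         # (out_h, out_w, c)
    to_rgb = rng.standard_normal((c, 3)) / np.sqrt(c)
    return (nearest * modulation) @ to_rgb

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 6, 8))            # low-resolution feature map
out = continuous_upsample(features, 11, 17, rng)     # non-integer scale factor
```

Because the output grid is defined by coordinates rather than by a fixed-stride layer, the same function handles any target size, which is the flexibility the abstract contrasts with fixed-scale upsamplers.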
