Paper Title
Deformable Kernel Convolutional Network for Video Extreme Super-Resolution
Paper Authors
Paper Abstract
Video super-resolution (VSR), which attempts to reconstruct high-resolution video frames from their corresponding low-resolution versions, has received increasing attention in recent years. Most existing approaches use deformable convolution to temporally align neighboring frames and apply a traditional, convolution-based spatial attention mechanism to enhance the reconstructed features. However, such a spatial-only strategy cannot fully utilize the temporal dependency among video frames. In this paper, we propose a novel deep-learning-based VSR algorithm, named Deformable Kernel Spatial Attention Network (DKSAN). Thanks to the newly designed Deformable Kernel Convolution Alignment (DKC_Align) and Deformable Kernel Spatial Attention (DKSA) modules, DKSAN can better exploit both spatial and temporal redundancies to facilitate information propagation across different layers. We tested DKSAN on the AIM2020 Video Extreme Super-Resolution Challenge, super-resolving videos with a scale factor as large as 16. Experimental results demonstrate that the proposed DKSAN achieves better subjective and objective performance than the existing state-of-the-art EDVR on the Vid3oC and IntVID datasets.
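To make the two mechanisms named in the abstract concrete, the sketch below illustrates the generic pattern of deformable-convolution frame alignment followed by convolution-based spatial attention. It is a minimal PyTorch illustration under stated assumptions, not the authors' DKC_Align/DKSA modules; the class names, channel sizes, and the use of torchvision.ops.DeformConv2d are all assumptions made for this example.

```python
# Minimal sketch (assumptions throughout): deformable-convolution alignment of
# a neighboring frame's features to a reference frame, then a plain
# convolution-based spatial attention mask. This is the generic baseline
# pattern the abstract refers to, NOT the paper's DKC_Align/DKSA modules.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformAlign(nn.Module):
    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Predict 2 sampling offsets (x, y) per kernel tap from the
        # concatenated reference and neighbor features.
        self.offset_conv = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        self.deform_conv = DeformConv2d(channels, channels, kernel_size, padding=pad)

    def forward(self, ref_feat, nbr_feat):
        offsets = self.offset_conv(torch.cat([ref_feat, nbr_feat], dim=1))
        # Warp the neighbor's features toward the reference frame.
        return self.deform_conv(nbr_feat, offsets)

class SpatialAttention(nn.Module):
    """Traditional convolution-based spatial attention: a sigmoid mask
    reweights each spatial position of the aligned features."""
    def __init__(self, channels=64):
        super().__init__()
        self.mask_conv = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

    def forward(self, feat):
        return feat * torch.sigmoid(self.mask_conv(feat))

if __name__ == "__main__":
    align, attn = DeformAlign(), SpatialAttention()
    ref = torch.randn(1, 64, 32, 32)   # reference-frame features
    nbr = torch.randn(1, 64, 32, 32)   # neighboring-frame features
    out = attn(align(ref, nbr))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

As the abstract argues, attending only over spatial positions of already-aligned features leaves temporal dependency underexploited, which is the gap the proposed deformable kernel modules are designed to address.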