Vision Transformers for Single Image Dehazing

Paper Title

Vision Transformers for Single Image Dehazing

Authors

Yuda Song, Zhuqing He, Hui Qian, Xin Du

Abstract

Image dehazing is a representative low-level vision task that estimates latent haze-free images from hazy images. In recent years, convolutional neural network-based methods have dominated image dehazing. However, vision Transformers, which have recently made a breakthrough in high-level vision tasks, have not brought new dimensions to image dehazing. We start with the popular Swin Transformer and find that several of its key designs are unsuitable for image dehazing. To this end, we propose DehazeFormer, which consists of various improvements, such as a modified normalization layer, activation function, and spatial information aggregation scheme. We train multiple variants of DehazeFormer on various datasets to demonstrate its effectiveness. Specifically, on the most frequently used SOTS indoor set, our small model outperforms FFA-Net with only 25% of the parameters and 5% of the computational cost. To the best of our knowledge, our large model is the first method with a PSNR over 40 dB on the SOTS indoor set, dramatically outperforming the previous state-of-the-art methods. We also collect a large-scale realistic remote sensing dehazing dataset for evaluating the method's capability to remove highly non-homogeneous haze.
