Paper Title
Visual Prompt Tuning for Generative Transfer Learning
Paper Authors
Paper Abstract
Transferring knowledge from an image synthesis model trained on a large dataset is a promising direction for efficiently learning generative image models for diverse domains. While previous works have studied GAN models, we present a recipe for learning vision transformers by generative knowledge transfer. We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens processed by an autoregressive or non-autoregressive transformer. To adapt to a new domain, we employ prompt tuning, which prepends learnable tokens, called a prompt, to the image token sequence, and we introduce a new prompt design for our task. We study a variety of visual domains, including the Visual Task Adaptation Benchmark~\cite{zhai2019large}, with varying amounts of training images, and show the effectiveness of knowledge transfer as well as significantly better image generation quality than existing works.
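To make the prompt-tuning idea concrete, below is a minimal sketch (not the authors' code; class and parameter names such as `PromptTunedTransformer` and `num_prompt_tokens` are hypothetical) of how learnable prompt tokens can be prepended to the image-token sequence of a frozen, pretrained generative transformer so that only the prompt parameters are updated during adaptation.

```python
# Minimal sketch of prompt tuning for a token-based generative transformer.
# Assumptions: the pretrained backbone accepts already-embedded token sequences,
# and image tokens come from a frozen visual tokenizer (e.g., VQ codes).
import torch
import torch.nn as nn


class PromptTunedTransformer(nn.Module):
    def __init__(self, backbone: nn.Module, token_embedding: nn.Embedding,
                 num_prompt_tokens: int = 16):
        super().__init__()
        self.backbone = backbone                # pretrained transformer, kept frozen
        self.token_embedding = token_embedding  # pretrained token embedding, kept frozen
        embed_dim = token_embedding.embedding_dim
        # The only new, trainable parameters: the prompt token embeddings.
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.02)
        for p in self.backbone.parameters():
            p.requires_grad = False
        for p in self.token_embedding.parameters():
            p.requires_grad = False

    def forward(self, image_tokens: torch.LongTensor) -> torch.Tensor:
        # image_tokens: (batch, seq_len) indices produced by the visual tokenizer
        tok = self.token_embedding(image_tokens)                        # (B, L, D)
        prompt = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)   # (B, P, D)
        x = torch.cat([prompt, tok], dim=1)                             # prepend the prompt
        return self.backbone(x)  # backbone predicts next or masked tokens as usual
```

In this sketch, adaptation to a new domain would train only `self.prompt` with the backbone's usual token-prediction loss; the specific prompt design proposed in the paper is not reproduced here.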