Disassembling Object Representations without Labels

Paper Title

Disassembling Object Representations without Labels

Authors

Zunlei Feng, Xinchao Wang, Yongming He, Yike Yuan, Xin Gao, Mingli Song

Abstract

In this paper, we study a new representation-learning task, which we term disassembling object representations. Given an image featuring multiple objects, the goal of disassembling is to acquire a latent representation in which each part corresponds to one category of objects. Disassembling thus finds application in a wide range of domains, such as image editing and few- or zero-shot learning, as it enables category-specific modularity in the learned representations. To this end, we propose an unsupervised approach to disassembling, named Unsupervised Disassembling Object Representation (UDOR). UDOR follows a double auto-encoder architecture, on which a fuzzy classification and an object-removing operation are imposed. The fuzzy classification constrains each part of the latent representation to encode features of at most one object category, while the object-removing operation, combined with a generative adversarial network, enforces the modularity of the representations and the integrity of the reconstructed image. Furthermore, we devise two metrics that measure, respectively, the modularity of the disassembled representations and the visual integrity of the reconstructed images. Experimental results demonstrate that the proposed UDOR, despite being unsupervised, achieves encouraging results on par with those of supervised methods.
