Paper Title
Private-Shared Disentangled Multimodal VAE for Learning of Hybrid Latent Representations
Paper Authors
Paper Abstract
Multi-modal generative models represent an important family of deep models, whose goal is to facilitate representation learning on data with multiple views or modalities. However, current deep multi-modal models focus on the inference of shared representations, while neglecting the important private aspects of data within individual modalities. In this paper, we introduce a disentangled multi-modal variational autoencoder (DMVAE) that utilizes a disentangled VAE strategy to separate the private and shared latent spaces of multiple modalities. We specifically consider the setting where the latent factors may be of both continuous and discrete nature, leading to a family of general hybrid DMVAE models. We demonstrate the utility of DMVAE on a semi-supervised learning task, where one of the modalities contains partial data labels, both relevant and irrelevant to the other modality. Our experiments on several benchmarks indicate the importance of the private-shared disentanglement as well as the hybrid latent representation.
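The hybrid latent structure described in the abstract (per-modality private continuous latents plus shared latents of both continuous and discrete nature) can be sketched with the two standard sampling mechanisms such models rely on: the Gaussian reparameterization trick for continuous factors and a Gumbel-Softmax relaxation for discrete factors. The sketch below is purely illustrative and uses hypothetical dimensions and encoder outputs; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gaussian(mu, log_var):
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def sample_gumbel_softmax(logits, tau=0.5):
    # Differentiable relaxation of a categorical sample (Gumbel-Softmax):
    # add Gumbel noise to the logits, then take a temperature-scaled softmax.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical encoder outputs for one modality (dimensions are illustrative):
mu_p, lv_p = np.zeros(4), np.zeros(4)   # private continuous latent parameters
mu_s, lv_s = np.zeros(8), np.zeros(8)   # shared continuous latent parameters
logits_s = np.zeros(10)                 # shared discrete latent logits (e.g., a label)

z_private = sample_gaussian(mu_p, lv_p)
z_shared_cont = sample_gaussian(mu_s, lv_s)
z_shared_disc = sample_gumbel_softmax(logits_s)

# A decoder for this modality would consume the concatenated hybrid latent:
z = np.concatenate([z_private, z_shared_cont, z_shared_disc])
```

In a full model, the shared latents would be encouraged to agree across modalities (e.g., by matching their posteriors), while each private latent is free to capture modality-specific variation; the discrete shared component is what makes semi-supervised use of partial labels natural.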