Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

Paper Title

Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

Authors

Masha Itkina, Boris Ivanovic, Ransalu Senanayake, Mykel J. Kochenderfer, Marco Pavone

Abstract

Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.

Paper Title