Paper title
Structured Uncertainty in the Observation Space of Variational Autoencoders
Paper authors
Paper abstract
Variational autoencoders (VAEs) are a popular class of deep generative models with many variants and a wide range of applications. Improvements upon the standard VAE mostly focus on the modelling of the posterior distribution over the latent space and the properties of the neural network decoder. In contrast, improving the model for the observational distribution is rarely considered and typically defaults to a pixel-wise independent categorical or normal distribution. In image synthesis, sampling from such distributions produces spatially-incoherent results with uncorrelated pixel noise, resulting in only the sample mean being somewhat useful as an output prediction. In this paper, we aim to stay true to VAE theory by improving the samples from the observational distribution. We propose SOS-VAE, an alternative model for the observation space, encoding spatial dependencies via a low-rank parameterisation. We demonstrate that this new observational distribution has the ability to capture relevant covariance between pixels, resulting in spatially-coherent samples. In contrast to pixel-wise independent distributions, our samples seem to contain semantically-meaningful variations from the mean allowing the prediction of multiple plausible outputs with a single forward pass.
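As a rough illustration of why a low-rank covariance yields spatially correlated samples while a pixel-wise independent normal does not, the following NumPy sketch draws samples from a Gaussian whose covariance is a low-rank term plus a diagonal. This is not the SOS-VAE implementation: the abstract only mentions a low-rank parameterisation, so the "low-rank plus diagonal" form, the function names, and the toy dimensions are all assumptions made for the example. In a VAE the mean, factor, and diagonal would presumably be decoder outputs; here they are random placeholders.

```python
import numpy as np


def sample_low_rank_gaussian(mean, factor, diag, n_samples, rng):
    """Draw samples from N(mean, factor @ factor.T + diag(diag)).

    mean:   (D,) flattened image mean
    factor: (D, R) low-rank covariance factor with R << D (assumed form)
    diag:   (D,) positive per-pixel variances (assumed extra term)
    """
    d, r = factor.shape
    # Shared noise: each of the R components perturbs many pixels at once,
    # which is what introduces correlation between pixels.
    eps_shared = rng.standard_normal((n_samples, r))
    # Independent noise: one value per pixel, as in a pixel-wise normal.
    eps_pixel = rng.standard_normal((n_samples, d))
    return mean + eps_shared @ factor.T + eps_pixel * np.sqrt(diag)


def dense_covariance(factor, diag):
    """Materialise the full (D, D) covariance; only feasible for small D."""
    return factor @ factor.T + np.diag(diag)


rng = np.random.default_rng(0)
D, R = 16, 2  # hypothetical sizes: a 4x4 "image" and a rank-2 factor
mean = np.zeros(D)
factor = rng.standard_normal((D, R))
diag = np.full(D, 0.1)

samples = sample_low_rank_gaussian(mean, factor, diag, 20000, rng)
true_cov = dense_covariance(factor, diag)
empirical_cov = np.cov(samples, rowvar=False)
```

Sampling this way needs only `R + D` random numbers per image and never forms the `D x D` covariance, which is the usual motivation for a low-rank parameterisation. Setting `factor` to zeros recovers the pixel-wise independent case, where all off-diagonal covariance entries vanish.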