Paper title
Joint Estimation of Image Representations and their Lie Invariants
Paper authors
Paper abstract
Images encode both the state of the world and its content. The former is useful for tasks such as planning and control, and the latter for classification. The automatic extraction of this information is challenging because of the high dimensionality and the entangled encoding inherent to the image representation. This article introduces two theoretical approaches aimed at resolving these challenges. The approaches allow for the interpolation and extrapolation of images from an image sequence by jointly estimating the image representation and the generators of the sequence dynamics. In the first approach, the image representations are learned using probabilistic PCA \cite{tipping1999probabilistic}. The linear-Gaussian conditional distributions allow for a closed-form analytical description of the latent distributions but assume that the underlying image manifold is a linear subspace. In the second approach, the image representations are learned using probabilistic nonlinear PCA, which relaxes the linear manifold assumption at the cost of requiring a variational approximation of the latent distributions. In both approaches, the underlying dynamics of the image sequence are modelled explicitly to disentangle them from the image representations. The dynamics themselves are modelled with Lie group structure, which enforces the desirable properties of smoothness and composability of inter-image transformations.
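To illustrate why Lie group structure gives smooth and composable transformations, the following minimal sketch advances a latent state with the exponential of a generator. It is not the method of the paper: the two-dimensional latent state `z0`, the fixed rotation generator `G`, the rate `omega`, and the helper `expm_taylor` are all assumptions chosen for illustration, and the generator is fixed here rather than estimated from an image sequence.

```python
import numpy as np

def expm_taylor(A, terms=20, squarings=8):
    """Approximate the matrix exponential by scaling and squaring
    combined with a truncated Taylor series (illustrative helper)."""
    A_scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A_scaled / k      # A_scaled^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result        # undo the scaling by repeated squaring
    return result

# Hypothetical generator: an element of so(2), which produces planar rotations.
omega = np.pi / 2                       # assumed rotation rate per frame
G = np.array([[0.0, -omega],
              [omega, 0.0]])

def transform(t):
    """Group element exp(t G) that advances a latent state by time t."""
    return expm_taylor(t * G)

z0 = np.array([1.0, 0.0])               # assumed latent state of frame 0
z_half = transform(0.5) @ z0            # interpolation between frames 0 and 1
z_one = transform(1.0) @ z0             # latent state of frame 1
z_two = transform(2.0) @ z0             # extrapolation beyond frame 1

# Composability: applying two half steps equals applying one full step.
composed = transform(0.5) @ transform(0.5)
```

Because `transform(t)` varies smoothly with `t`, fractional values of `t` interpolate between observed frames and values beyond the last frame extrapolate. The identity `transform(s) @ transform(t) == transform(s + t)` (up to floating-point error) is the composability property mentioned in the abstract.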