Lightweight Data Fusion with Conjugate Mappings

Paper Title

Lightweight Data Fusion with Conjugate Mappings

Authors

Christopher L. Dean, Stephen J. Lee, Jason Pacheco, John W. Fisher III

Abstract

We present an approach to data fusion that combines the interpretability of structured probabilistic graphical models with the flexibility of neural networks. The proposed method, lightweight data fusion (LDF), emphasizes posterior analysis over latent variables using two types of information: primary data, which are well-characterized but with limited availability, and auxiliary data, readily available but lacking a well-characterized statistical relationship to the latent quantity of interest. The lack of a forward model for the auxiliary data precludes the use of standard data fusion approaches, while the inability to acquire latent variable observations severely limits direct application of most supervised learning methods. LDF addresses these issues by utilizing neural networks as conjugate mappings of the auxiliary data: nonlinear transformations into sufficient statistics with respect to the latent variables. This facilitates efficient inference by preserving the conjugacy properties of the primary data and leads to compact representations of the latent variable posterior distributions. We demonstrate the LDF methodology on two challenging inference problems: (1) learning electrification rates in Rwanda from satellite imagery, high-level grid infrastructure, and other sources; and (2) inferring county-level homicide rates in the USA by integrating socio-economic data using a mixture model of multiple conjugate mappings.
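The idea of a conjugate mapping can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a Beta prior over a latent rate with binomial primary data, and it replaces the neural network with a hypothetical one-layer function, `conjugate_mapping`, whose weights are made up for illustration. The mapping turns an auxiliary feature vector into two non-negative pseudo-counts, which enter the posterior exactly like observed counts, so the posterior stays in the Beta family.

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the mapped statistics positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def conjugate_mapping(aux, weights, bias):
    """Hypothetical stand-in for a neural network.

    Maps an auxiliary feature vector to two pseudo-counts that act as
    sufficient statistics for a Beta-distributed latent rate.
    """
    outputs = []
    for w_row, b in zip(weights, bias):
        z = sum(w * x for w, x in zip(w_row, aux)) + b
        outputs.append(softplus(z))
    return outputs[0], outputs[1]


def beta_posterior(prior_a, prior_b, successes, failures, pseudo_a, pseudo_b):
    # Conjugate update: primary counts and mapped pseudo-counts simply add.
    return prior_a + successes + pseudo_a, prior_b + failures + pseudo_b


def beta_mean(a, b):
    return a / (a + b)


# Illustrative values only (assumptions, not data from the paper).
aux_features = [0.8, -0.2]            # e.g. features derived from imagery
weights = [[1.5, 0.3], [-1.0, 0.4]]   # hypothetical learned parameters
bias = [0.1, 0.1]

pseudo_a, pseudo_b = conjugate_mapping(aux_features, weights, bias)
post_a, post_b = beta_posterior(1.0, 1.0, 3, 7, pseudo_a, pseudo_b)
print("posterior mean of the latent rate:", beta_mean(post_a, post_b))
```

Because the auxiliary data only contribute additive statistics, the posterior is summarized by two numbers, which is one way to read the abstract's claim about compact posterior representations.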
