Paper Title
Deep equilibrium models as estimators for continuous latent variables
Paper Authors
Paper Abstract
Principal Component Analysis (PCA) and its exponential family extensions have three components: observations, latents and parameters of a linear transformation. We consider a generalised setting where the canonical parameters of the exponential family are a nonlinear transformation of the latents. We show explicit relationships between particular neural network architectures and the corresponding statistical models. We find that deep equilibrium models -- a recently introduced class of implicit neural networks -- solve for the maximum a-posteriori (MAP) estimates of the latents and the parameters of the transformation. Our analysis provides a systematic way to relate activation functions, dropout, and layer structure to statistical assumptions about the observations, thus providing foundational principles for unsupervised DEQs. For hierarchical latents, individual neurons can be interpreted as nodes in a deep graphical model. Our DEQ feature maps are end-to-end differentiable, enabling fine-tuning for downstream tasks.
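To make the abstract's central object concrete, below is a minimal sketch of a deep equilibrium layer solved by naive fixed-point iteration: the output is the fixed point z* = f(z*, x) rather than the result of a fixed number of stacked layers. The single-layer parameterisation f(z, x) = activation(W z + U x + b), the weight names, dimensions, and the use of plain iteration (rather than a root-finding solver) are illustrative assumptions, not the paper's implementation; the identification of z* with a MAP estimate of the latents is the paper's claim, not something this snippet verifies.

```python
import numpy as np

def deq_fixed_point(x, W, U, b, activation=np.tanh, tol=1e-6, max_iter=500):
    """Solve z* = activation(W @ z* + U @ x + b) by naive fixed-point iteration.

    In the paper's framing, the fixed point z* plays the role of a MAP
    estimate of the latents given the observation x, with the activation
    encoding the statistical assumptions about the observations.
    """
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = activation(W @ z + U @ x + b)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Hypothetical usage with random weights and a single observation.
rng = np.random.default_rng(0)
d_latent, d_obs = 8, 16
W = 0.1 * rng.standard_normal((d_latent, d_latent))  # small norm so the map contracts
U = rng.standard_normal((d_latent, d_obs))
b = np.zeros(d_latent)
x = rng.standard_normal(d_obs)
z_star = deq_fixed_point(x, W, U, b)
```

Naive iteration is used here only for clarity; practical DEQ implementations typically use accelerated root-finding (e.g. Anderson acceleration or Broyden's method) and implicit differentiation through the fixed point, which is what makes the feature map end-to-end differentiable.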