Paper Title
Multivariate Density Estimation with Deep Neural Mixture Models
Paper Authors
Abstract
Although worryingly underrated in the recent literature on machine learning in general (and on deep learning in particular), multivariate density estimation is a fundamental task in many applications, at least implicitly, and remains an open problem. With a few exceptions, deep neural networks (DNNs) have seldom been applied to density estimation, mostly because of the unsupervised nature of the estimation task and, especially, because of the need for constrained training algorithms that yield proper probabilistic models satisfying Kolmogorov's axioms. Moreover, in spite of the well-known improvement in modeling capabilities that mixture models offer over plain single-density statistical estimators, no proper mixtures of multivariate DNN-based component densities have been investigated so far. The paper fills this gap by extending our previous work on Neural Mixture Densities (NMMs) to multivariate DNN mixtures. A maximum-likelihood (ML) algorithm for estimating Deep NMMs (DNMMs) is presented, which numerically satisfies a combination of hard and soft constraints aimed at ensuring that Kolmogorov's axioms hold. The class of probability density functions that can be modeled to any degree of precision via DNMMs is formally defined. A procedure for the automatic selection of the DNMM architecture, as well as of the hyperparameters of its ML training algorithm, is presented; it exploits the probabilistic nature of the DNMM. Experimental results on univariate and multivariate data are reported, corroborating the effectiveness of the approach and its superiority over the most popular statistical estimation techniques.
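For orientation, the generic form of a finite mixture density, the constraints implied by Kolmogorov's axioms, and the ML criterion can be sketched as follows. The notation is assumed for illustration only (the symbols $K$, $c_i$, $p_i$, $W_i$, $\theta$, and $n$ do not appear in the abstract), and the paper's own formulation may differ.

```latex
% Assumed notation (not taken from the abstract):
%   K    number of mixture components
%   c_i  mixing weight of component i
%   p_i  component density realized by the i-th DNN with parameters W_i
%   x_1, ..., x_n  training sample
\begin{align}
  p(\mathbf{x} \mid \theta) &= \sum_{i=1}^{K} c_i \, p_i(\mathbf{x} \mid W_i),
    \qquad \theta = \{c_i, W_i\}_{i=1}^{K}, \\
  c_i &\ge 0, \qquad \sum_{i=1}^{K} c_i = 1, \\
  p_i(\mathbf{x} \mid W_i) &\ge 0, \qquad
    \int p_i(\mathbf{x} \mid W_i)\, d\mathbf{x} = 1, \\
  \hat{\theta}_{\mathrm{ML}} &= \arg\max_{\theta} \sum_{j=1}^{n} \log p(\mathbf{x}_j \mid \theta).
\end{align}
```

If every component satisfies the third line and the weights satisfy the second, the mixture in the first line is itself non-negative and integrates to one. The abstract does not say which of these conditions are enforced as hard constraints and which as soft ones.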