Paper Title

Schoenberg-Rao distances: Entropy-based and geometry-aware statistical Hilbert distances

Paper Authors

Gaëtan Hadjeres, Frank Nielsen

Paper Abstract

Distances between probability distributions that take into account the geometry of their sample space, like the Wasserstein or the Maximum Mean Discrepancy (MMD) distances, have received a lot of attention in machine learning as they can, for instance, be used to compare probability distributions with disjoint supports. In this paper, we study a class of statistical Hilbert distances that we term the Schoenberg-Rao distances, a generalization of the MMD that allows one to consider a broader class of kernels, namely the conditionally negative semi-definite kernels. In particular, we introduce a principled way to construct such kernels and derive novel closed-form distances between mixtures of Gaussian distributions. These distances, derived from the concave Rao's quadratic entropy, enjoy nice theoretical properties and possess interpretable hyperparameters which can be tuned for specific applications. Our method constitutes a practical alternative to Wasserstein distances and we illustrate its efficiency on a broad range of machine learning tasks such as density estimation, generative modeling and mixture simplification.
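
To make the abstract's description more concrete, here is a minimal sketch (not the authors' reference implementation) of an empirical MMD-style discrepancy built on a conditionally negative semi-definite kernel, assuming it takes the energy-distance-like form E[k(X,Y)] - 0.5*E[k(X,X')] - 0.5*E[k(Y,Y')]. The function names `schoenberg_rao` and `cnd_kernel`, and the choice of the Euclidean distance as the kernel, are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch: an empirical Schoenberg-Rao-style discrepancy between two
# weighted point sets, assuming the energy-distance-like form
#   D(p, q) = E[k(X, Y)] - 0.5 E[k(X, X')] - 0.5 E[k(Y, Y')]
# with a conditionally negative semi-definite kernel k (here: Euclidean distance).
import numpy as np

def cnd_kernel(x, y):
    """Pairwise Euclidean distances, a classical conditionally negative semi-definite kernel."""
    return np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)

def schoenberg_rao(x, y, wx=None, wy=None):
    """Empirical discrepancy between weighted samples x of shape (n, d) and y of shape (m, d)."""
    wx = np.full(len(x), 1.0 / len(x)) if wx is None else wx
    wy = np.full(len(y), 1.0 / len(y)) if wy is None else wy
    cross = wx @ cnd_kernel(x, y) @ wy      # E[k(X, Y)]
    within_x = wx @ cnd_kernel(x, x) @ wx   # E[k(X, X')]
    within_y = wy @ cnd_kernel(y, y) @ wy   # E[k(Y, Y')]
    return cross - 0.5 * within_x - 0.5 * within_y

# Usage example: two Gaussian samples with well-separated supports.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
y = rng.normal(loc=5.0, scale=1.0, size=(200, 2))
print(schoenberg_rao(x, y))  # positive, and grows as the supports move apart
```

Because the kernel here is a plain distance rather than a positive-definite kernel, this sketch also illustrates why such discrepancies, like the Wasserstein distance, remain informative when the two distributions have disjoint supports.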
