Paper Title

Universal consistency of Wasserstein $k$-NN classifier: Negative and Positive Results

Author

Ponnoprat, Donlapark

Abstract

The Wasserstein distance provides a notion of dissimilarity between probability measures, with recent applications in learning from structured data of varying size, such as images and text documents. In this work, we study the $k$-nearest neighbor ($k$-NN) classifier of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space or on the Wasserstein space itself. To this end, via the notion of $σ$-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of measures supported in a $σ$-uniformly discrete set. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on the space of measures supported on a finite set, the space of Gaussian measures, and the space of measures with densities expressed as finite wavelet series.
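To make the object of study concrete, here is a minimal sketch of a $k$-NN classifier whose inputs are probability measures compared under the Wasserstein distance. It is not the paper's construction: it assumes $p=1$, measures on the real line, and that each measure is represented by a finite sample with uniform weights, so that the distance reduces to the area between the two empirical CDFs. The names `wasserstein1` and `knn_predict` are hypothetical, and the sketch illustrates only the classification rule, not any consistency property.

```python
from bisect import bisect_right
from collections import Counter


def wasserstein1(xs, ys):
    """1-Wasserstein distance between two empirical measures on the real line.

    Assumption: each sample carries uniform weights. In one dimension,
    W_1 equals the integral of |F - G| over the line, where F and G are
    the two empirical CDFs; both are step functions, so the integral is
    a finite sum over consecutive breakpoints.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    points = sorted(set(xs_sorted) | set(ys_sorted))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both CDFs are constant on [left, right).
        f = bisect_right(xs_sorted, left) / len(xs_sorted)
        g = bisect_right(ys_sorted, left) / len(ys_sorted)
        total += abs(f - g) * (right - left)
    return total


def knn_predict(train, query, k):
    """Majority vote among the k training measures closest to `query`.

    `train` is a list of (sample, label) pairs; samples may differ in size.
    Ties in the vote are resolved by Counter.most_common, which this
    sketch does not try to control.
    """
    ranked = sorted(train, key=lambda pair: wasserstein1(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: label "a" concentrates near 0.15, label "b" near 0.85.
train = [
    ([0.10, 0.20, 0.15], "a"),
    ([0.12, 0.22], "a"),
    ([0.80, 0.90, 0.85], "b"),
    ([0.70, 0.95], "b"),
]
prediction = knn_predict(train, [0.11, 0.18], k=3)
```

Because the distance is computed between samples of any size, the training measures need not share a common number of points, which matches the varying-size setting the abstract mentions.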
