IAN: Iterated Adaptive Neighborhoods for manifold learning and dimensionality estimation

Paper title

IAN: Iterated Adaptive Neighborhoods for manifold learning and dimensionality estimation

Authors

Luciano Dyballa, Steven W. Zucker

Abstract

Invoking the manifold assumption in machine learning requires knowledge of the manifold's geometry and dimension, and theory dictates how many samples are required. However, in applications data are limited, sampling may not be uniform, and manifold properties are unknown and (possibly) non-pure; this implies that neighborhoods must adapt to the local structure. We introduce an algorithm for inferring adaptive neighborhoods for data given by a similarity kernel. Starting with a locally-conservative neighborhood (Gabriel) graph, we sparsify it iteratively according to a weighted counterpart. In each step, a linear program yields minimal neighborhoods globally and a volumetric statistic reveals neighbor outliers likely to violate manifold geometry. We apply our adaptive neighborhoods to non-linear dimensionality reduction, geodesic computation and dimension estimation. A comparison against standard algorithms using, e.g., k-nearest neighbors, demonstrates their usefulness. Code for our algorithm will be available at https://github.com/dyballa/IAN
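
The abstract names the Gabriel graph as the algorithm's starting point. As a rough illustration of that one ingredient only, the sketch below builds an unweighted Gabriel graph from point coordinates: two points are connected when no third point lies strictly inside the ball whose diameter is the segment between them. This is not the authors' implementation. The function name `gabriel_graph` and the use of plain Euclidean coordinates are assumptions made for illustration; the paper works from a similarity kernel and then sparsifies the graph iteratively, which is not shown here.

```python
import numpy as np


def gabriel_graph(points):
    """Return a boolean adjacency matrix of the Gabriel graph of `points`.

    Hypothetical helper for illustration; brute-force O(n^3) construction.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise squared Euclidean distances.
    diff = pts[:, None, :] - pts[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            others = np.ones(n, dtype=bool)
            others[[i, j]] = False
            # Point k is strictly inside the ball with diameter (i, j)
            # exactly when d(i,k)^2 + d(j,k)^2 < d(i,j)^2.
            blocked = np.any(d2[i, others] + d2[j, others] < d2[i, j])
            adj[i, j] = adj[j, i] = not blocked
    return adj


# Three nearly collinear points: the middle one blocks the long edge.
adjacency = gabriel_graph([[0.0, 0.0], [2.0, 0.0], [1.0, 0.1]])
```

In this small example the edge between the two outer points is dropped, while each outer point stays connected to the middle one. The brute-force loop is only meant to make the definition concrete; it would be too slow for large datasets.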
