树！我不是树！我是一个低维双曲线嵌入

论文标题

树！我不是树！我是一个低维双曲线嵌入

Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding

论文作者

Sonthalia, Rishi, Gilbert, Anna C.

论文摘要

给定数据，找到数据的忠实低维双曲线嵌入是一种关键方法，我们可以通过它提取分层信息或学习数据的代表性几何特征。在本文中，我们通过采用度量优先的方法来探讨一种学习双曲线表示的新方法。我们没有直接确定低维双曲线嵌入，而是学习数据上的树结构。然后，该树结构可直接用于提取层次信息，使用Sarkar的Construction \ Cite {Sarkar}嵌入双曲线歧管中，或用作原始度量的树近似。为此，我们提出了一种新颖的快速算法\ textsc {treerep}，以便给定$δ$ - 溶质度度量（对于任何$δ\ geq 0 $），该算法学习了一个近似于原始度量的树结构。在$δ= 0 $的情况下，我们通过分析表明\ textsc {treerep}恰好恢复原始树结构。我们从经验上表明，\ textsc {treerep}不仅比以前已知的算法快很多数量级，而且还会产生平均失真和平均平均精度较低的指标，而平均平均精度则比大多数以前的学习双曲线嵌入算法，用于学习杂音嵌入的算法，通过树准提取层次计量，以及通过树准计量来提取层次。

Given data, finding a faithful low-dimensional hyperbolic embedding of the data is a key method by which we can extract hierarchical information or learn representative geometric features of the data. In this paper, we explore a new method for learning hyperbolic representations by taking a metric-first approach. Rather than determining the low-dimensional hyperbolic embedding directly, we learn a tree structure on the data. This tree structure can then be used directly to extract hierarchical information, embedded into a hyperbolic manifold using Sarkar's construction \cite{sarkar}, or used as a tree approximation of the original metric. To this end, we present a novel fast algorithm \textsc{TreeRep} such that, given a $δ$-hyperbolic metric (for any $δ\geq 0$), the algorithm learns a tree structure that approximates the original metric. In the case when $δ= 0$, we show analytically that \textsc{TreeRep} exactly recovers the original tree structure. We show empirically that \textsc{TreeRep} is not only many orders of magnitude faster than previously known algorithms, but also produces metrics with lower average distortion and higher mean average precision than most previous algorithms for learning hyperbolic embeddings, extracting hierarchical information, and approximating metrics via tree metrics.

下载PDF全文

下载文献需遵守相关版权规定

论文标题