论文标题

在接近线性时间内扰动稳定性下聚类

Clustering under Perturbation Stability in Near-Linear Time

论文作者

Agarwal, Pankaj K., Chang, Hsien-Chih, Munagala, Kamesh, Taylor, Erin, Welzl, Emo

论文摘要

我们考虑在扰动稳定性假设下,在低维欧几里德空间中基于中心聚类的问题。如果基础最佳聚类仍然保持最佳状态,即使所有成对距离都被任意地受到最多$α$的因素的影响,但实例是$α$稳定。我们的主要贡献是针对$α$稳定的聚类实例提出有效的精确算法,其运行时间在$α\ ge 2 + \ sqrt {3} $时几乎是基于数据集的大小。对于$ k $ - center和$ k $ -MEANS问题,我们的算法还可以在任何固定尺寸中的任何常数$ε> 0 $的$α\ geq 2 + \ sqrt {3} +ε$中实现$ k $的多项式依赖。对于$ k $ -Median,我们的算法在任何固定尺寸的$α> 5 $上具有多项式依赖性;对于$α\ geq 2 + \ sqrt {3} $,在两个维度上。我们的算法很简单,只需要应用技术,例如本地搜索或动态编程,以适当修改的度量空间,并仔细选择数据结构。

We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is $α$-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most $α$. Our main contribution is in presenting efficient exact algorithms for $α$-stable clustering instances whose running times depend near-linearly on the size of the data set when $α\ge 2 + \sqrt{3}$. For $k$-center and $k$-means problems, our algorithms also achieve polynomial dependence on the number of clusters, $k$, when $α\geq 2 + \sqrt{3} + ε$ for any constant $ε> 0$ in any fixed dimension. For $k$-median, our algorithms have polynomial dependence on $k$ for $α> 5$ in any fixed dimension; and for $α\geq 2 + \sqrt{3}$ in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源