Paper Title

Flow-Based Clustering and Spectral Clustering: A Comparison

Authors

SarcheshmehPour, Y., Tian, Y., Zhang, L., Jung, A.

Abstract

We propose and study a novel graph clustering method for data with an intrinsic network structure. Similar to spectral clustering, we exploit the intrinsic network structure of the data to construct Euclidean feature vectors. These feature vectors can then be fed into basic clustering methods such as k-means or soft clustering based on a Gaussian mixture model (GMM). What sets our approach apart from spectral clustering is that we do not use the eigenvectors of a graph Laplacian to construct the feature vectors. Instead, we use the solutions of total variation minimization problems to construct feature vectors that reflect the connectivity between data points. Our motivation is that the solutions of total variation minimization are piecewise constant around a given set of seed nodes. These seed nodes can be obtained from domain knowledge or from simple heuristics based on the network structure of the data. Our results indicate that our clustering methods can cope with certain graph structures that are challenging for spectral clustering methods.
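For readers unfamiliar with the two quantities the abstract contrasts, the following sketch writes out their standard textbook definitions. The notation is an assumption made here and does not come from the abstract: an undirected graph with edge set E, nonnegative edge weights W_{ij}, degree matrix D, and a graph signal x that assigns a value x_i to node i. The exact optimization problem solved in the paper is not stated in the abstract and is not reproduced here.

```latex
% Total variation of a graph signal x (standard definition; notation assumed)
\[
  \|x\|_{\mathrm{TV}} \;=\; \sum_{\{i,j\} \in E} W_{ij}\, \lvert x_i - x_j \rvert
\]

% Quadratic form of the graph Laplacian L = D - W, whose eigenvectors
% supply the feature vectors in spectral clustering
\[
  x^{\top} L\, x \;=\; \sum_{\{i,j\} \in E} W_{ij}\, (x_i - x_j)^2
\]
```

The only difference is the absolute value in place of the square. An absolute-value penalty tends to favor signals that stay constant over well-connected groups of nodes and change across only a few edges, which is consistent with the piecewise constant behavior the abstract gives as its motivation.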
