论文标题

大图聚类的随机可行特征膨胀

Stochastic Parallelizable Eigengap Dilation for Large Graph Clustering

论文作者

van der Pol, Elise, Gemp, Ian, Bachrach, Yoram, Everett, Richard

论文摘要

大图通常出现在社交网络,知识图,推荐系统,生命科学和决策问题中。通过其高级别属性总结大图有助于解决这些设置中的问题。在光谱聚类中,我们旨在确定大多数边缘落在簇内的节点簇,而在簇之间只有很少的边缘。此任务对于许多下游应用和探索性分析很重要。光谱聚类的核心步骤是执行相应图的拉普拉斯矩阵(或等效地,奇异值分解,svd,svd,svd)的特征分类。迭代奇异值分解方法的收敛取决于给定矩阵的频谱的特征,即连续特征值之间的差异。对于对应于群集图形的图形拉普拉斯式的图形,特征值将是非负的,但很小(小于$ 1 $)的收敛性会减慢收敛性。本文引入了一种可行的扩张频谱的方法,以加速SVD求解器并依次加速光谱聚类。这是通过对矩阵操作的多项式近似来实现的,该矩阵操作有利地改变矩阵的光谱而不更改其特征向量。实验表明,这种方法显着加速了收敛,我们解释了如何并行化和随机近似于可用的计算。

Large graphs commonly appear in social networks, knowledge graphs, recommender systems, life sciences, and decision making problems. Summarizing large graphs by their high level properties is helpful in solving problems in these settings. In spectral clustering, we aim to identify clusters of nodes where most edges fall within clusters and only few edges fall between clusters. This task is important for many downstream applications and exploratory analysis. A core step of spectral clustering is performing an eigendecomposition of the corresponding graph Laplacian matrix (or equivalently, a singular value decomposition, SVD, of the incidence matrix). The convergence of iterative singular value decomposition approaches depends on the eigengaps of the spectrum of the given matrix, i.e., the difference between consecutive eigenvalues. For a graph Laplacian corresponding to a well-clustered graph, the eigenvalues will be non-negative but very small (much less than $1$) slowing convergence. This paper introduces a parallelizable approach to dilating the spectrum in order to accelerate SVD solvers and in turn, spectral clustering. This is accomplished via polynomial approximations to matrix operations that favorably transform the spectrum of a matrix without changing its eigenvectors. Experiments demonstrate that this approach significantly accelerates convergence, and we explain how this transformation can be parallelized and stochastically approximated to scale with available compute.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源