EGRC-Net: Embedding-Induced Graph Refinement Clustering Network

Paper Title

EGRC-Net: Embedding-induced Graph Refinement Clustering Network

Authors

Zhihao Peng, Hui Liu, Yuheng Jia, Junhui Hou

Abstract

Existing graph clustering networks rely heavily on a predefined, fixed graph, which can lead to failures when the initial graph does not accurately capture the topological structure of the data in the embedding space. To address this issue, we propose a novel clustering network called Embedding-Induced Graph Refinement Clustering Network (EGRC-Net), which uses the learned embedding to adaptively refine the initial graph and enhance clustering performance. First, we leverage both semantic and topological information by employing a vanilla auto-encoder and a graph convolution network, respectively, to learn a latent feature representation. Next, we use the local geometric structure within the feature embedding space to construct an adjacency matrix for the graph. This adjacency matrix is dynamically fused with the initial one using our proposed fusion architecture. To train the network in an unsupervised manner, we minimize the Jeffreys divergence between multiple derived distributions. Additionally, we introduce an improved approximate personalized propagation of neural predictions to replace the standard graph convolution network, enabling EGRC-Net to scale effectively. Through extensive experiments on nine widely used benchmark datasets, we demonstrate that our proposed methods consistently outperform several state-of-the-art approaches. Notably, EGRC-Net achieves an improvement of more than 11.99% in Adjusted Rand Index (ARI) over the best baseline on the DBLP dataset. Furthermore, our scalable approach exhibits a 10.73% gain in ARI while reducing memory usage by 33.73% and decreasing running time by 19.71%. The code for EGRC-Net will be made publicly available at https://github.com/ZhihaoPENG-CityU/EGRC-Net.
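The training objective mentioned in the abstract, the Jeffreys divergence, is the symmetrized Kullback-Leibler divergence: J(P, Q) = KL(P || Q) + KL(Q || P). The sketch below is only an illustration of that quantity for two discrete distributions, not the authors' implementation. The function names, the `eps` guard, and the example distributions are assumptions made here, and the abstract does not say which derived distributions the network actually compares. Some references scale the sum by 1/2; that factor is omitted here.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Terms with p_i == 0 contribute nothing; eps (an assumption) avoids
    # division by zero when q_i == 0.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def jeffreys_divergence(p, q):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical soft cluster assignments for one sample over three clusters.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

print(jeffreys_divergence(p, q))  # positive, and equal to jeffreys_divergence(q, p)
print(jeffreys_divergence(p, p))  # identical distributions give 0.0
```

Unlike plain KL divergence, this quantity does not depend on the order of its arguments, so neither distribution is treated as the fixed target.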

