论文标题

基于边缘的本地推动个性化Pagerank

Edge-based Local Push for Personalized PageRank

论文作者

Wang, Hanzhi, Wei, Zhewei, Gan, Junhao, Yuan, Ye, Du, Xiaoyong, Wen, Ji-Rong

论文摘要

个性化Pagerank(PPR)是图挖掘和网络研究中流行的节点邻近度量。给定图G =(v,e)和v $中的源节点$ s \,单源ppr(ssppr)询问询问ppr value $ \ vpi(u)$相对于s,这代表了源节点s上下文中节点u的相对重要性。在现有的SSPPR查询算法中,LocalPush是一种基本方法,是后续算法的基石。在LocalPush中,推动操作是一个至关重要的原始操作,它通过相应的边缘将节点U的概率分配给所有U邻居。尽管此推动操作在未加权的图上效果很好,但不幸的是,它在加权图上效率相当低。特别是,在不平衡的加权图上,其中只有少数这些边缘占总重量的大部分,推动操作将必须沿着那些仅占据较小权重的边缘分发微不足道的概率,从而导致昂贵的开销。 为了解决此问题,我们提出了EdgePush算法,这是一种在加权图上计算SSPPR查询的新方法。 EdgePush以基于边缘的推动分解了上述推动操作,从而使算法可以在边缘级别粒度下运行。因此,它可以根据边缘的重量灵活地分配概率。此外,我们的EdgePush允许每个单个边缘的细粒度终止阈值,从而导致比LocalPush的较高复杂性。值得注意的是,我们证明,当图形的权重不平衡时,EdgePush将LocalPush的理论查询成本提高了最高O(n)的订单,无论是$ \ ell_1 $ - eRROR和归一化添加剂错误而言。我们的实验结果表明,在大型基于基序和现实世界的加权图上查询效率方面,EdgePush显着胜过最先进的基线。

Personalized PageRank (PPR) is a popular node proximity metric in graph mining and network research. Given a graph G=(V,E) and a source node $s \in V$, a single-source PPR (SSPPR) query asks for the PPR value $\vpi(u)$ with respect to s, which represents the relative importance of node u in the context of the source node s. Among existing algorithms for SSPPR queries, LocalPush is a fundamental method which serves as a cornerstone for subsequent algorithms. In LocalPush, a push operation is a crucial primitive operation, which distributes the probability at a node u to ALL u's neighbors via the corresponding edges. Although this push operation works well on unweighted graphs, unfortunately, it can be rather inefficient on weighted graphs. In particular, on unbalanced weighted graphs where only a few of these edges take the majority of the total weight among them, the push operation would have to distribute insignificant probabilities along those edges which just take the minor weights, resulting in expensive overhead. To resolve this issue, we propose the EdgePush algorithm, a novel method for computing SSPPR queries on weighted graphs. EdgePush decomposes the aforementioned push operations in edge-based push, allowing the algorithm to operate at the edge level granularity. Hence, it can flexibly distribute the probabilities according to edge weights. Furthermore, our EdgePush allows a fine-grained termination threshold for each individual edge, leading to a superior complexity over LocalPush. Notably, we prove that EdgePush improves the theoretical query cost of LocalPush by an order of up to O(n) when the graph's weights are unbalanced, both in terms of $\ell_1$-error and normalized additive error. Our experimental results demonstrate that EdgePush significantly outperforms state-of-the-art baselines in terms of query efficiency on large motif-based and real-world weighted graphs.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源