论文标题
通过有效的本地搜索增强平衡图边缘分区
Enhancing Balanced Graph Edge Partition with Effective Local Search
论文作者
论文摘要
图形分区是在并行图处理系统中实现工作负载平衡并减少工作完成时间的关键组件。在各种分区策略中,边缘分区在幂律图中表现出比顶点分区更有希望的性能,因此,现有图形系统被更广泛地采用为默认分区策略。从优化和算法的角度研究了图形边缘分区问题,即将边缘设置为多个平衡部分以最大程度地减少复制顶点的总数。在本文中,我们研究了此问题的本地搜索算法,以进一步改善现有方法的分区结果。更具体地说,我们提出了两个新颖的概念,即可调节的边缘和块。基于这些,我们利用Max-Flow模型的属性开发了一种贪婪的启发式以及改进的搜索算法。为了评估算法的性能,我们首先根据近似质量提供了足够的理论分析。对于此问题,我们显着提高了先前已知的近似值。然后,我们对大量基准数据集和最先进的边缘分区策略进行了广泛的实验。结果表明,我们提出的本地搜索框架可以通过广泛的边距进一步提高图形分区的质量。
Graph partition is a key component to achieve workload balance and reduce job completion time in parallel graph processing systems. Among the various partition strategies, edge partition has demonstrated more promising performance in power-law graphs than vertex partition and thereby has been more widely adopted as the default partition strategy by existing graph systems. The graph edge partition problem, which is to split the edge set into multiple balanced parts to minimize the total number of copied vertices, has been widely studied from the view of optimization and algorithms. In this paper, we study local search algorithms for this problem to further improve the partition results from existing methods. More specifically, we propose two novel concepts, namely adjustable edges and blocks. Based on these, we develop a greedy heuristic as well as an improved search algorithm utilizing the property of the max-flow model. To evaluate the performance of our algorithms, we first provide adequate theoretical analysis in terms of the approximation quality. We significantly improve the previously known approximation ratio for this problem. Then we conduct extensive experiments on a large number of benchmark datasets and state-of-the-art edge partition strategies. The results show that our proposed local search framework can further improve the quality of graph partition by a wide margin.