论文标题

Slimsell:用于广度优先搜索的可矢量化图表

SlimSell: A Vectorizable Graph Representation for Breadth-First Search

论文作者

Besta, Maciej, Marending, Florian, Solomonik, Edgar, Hoefler, Torsten

论文摘要

矢量化和GPU将深刻改变图形处理。针对基于32或64位的内存访问调整的传统图形算法将在Intel骑士登陆(KNL)Man​​ulcore CPU中已经存在的512位宽(或更大)指令单元的体系结构效率低下。在预期这一转变时,我们提出了Slimsell:可矢量化的图表表示,以基于稀疏的Matrix密度向量(SPMV)产品加速广度优先搜索(BFS)。 Slimsell扩展并结合了最先进的SIMD友好型C-Sigma矩阵存储格式,以及热带,真实,布尔值和SEL-MAX半度操作。所得设计可将必要的存储空间减少(最多50%),从而在内存子系统上施加压力。我们使用纤薄和纤巧的方案增强了Slimsell,从而减少了工作量并提高了负载平衡,进一步加速了BF。我们评估了Intel Haswell Multicore CPU,最先进的Intel Xeon Phi Knl多核CPU和Nvidia Tesla GPU的所有方案。我们的实验表明,哪种半尿道为BF提供了最高的加速,并说明Slimsell将调谐的Graph500 BFS代码加速高达33%。这项工作表明,矢量化可以基于SPMV产品在BFS中确保高性能。提出的原理和设计可以扩展到其他图算法。

Vectorization and GPUs will profoundly change graph processing. Traditional graph algorithms tuned for 32- or 64-bit based memory accesses will be inefficient on architectures with 512-bit wide (or larger) instruction units that are already present in the Intel Knights Landing (KNL) manycore CPU. Anticipating this shift, we propose SlimSell: a vectorizable graph representation to accelerate Breadth-First Search (BFS) based on sparse-matrix dense-vector (SpMV) products. SlimSell extends and combines the state-of-the-art SIMD-friendly Sell-C-sigma matrix storage format with tropical, real, boolean, and sel-max semiring operations. The resulting design reduces the necessary storage (by up to 50%) and thus pressure on the memory subsystem. We augment SlimSell with the SlimWork and SlimChunk schemes that reduce the amount of work and improve load balance, further accelerating BFS. We evaluate all the schemes on Intel Haswell multicore CPUs, the state-of-the-art Intel Xeon Phi KNL manycore CPUs, and NVIDIA Tesla GPUs. Our experiments indicate which semiring offers highest speedups for BFS and illustrate that SlimSell accelerates a tuned Graph500 BFS code by up to 33%. This work shows that vectorization can secure high-performance in BFS based on SpMV products; the proposed principles and designs can be extended to other graph algorithms.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源