通过采样压缩比预测稀疏基质乘法的输出结构

论文标题

通过采样压缩比预测稀疏基质乘法的输出结构

Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio

论文作者

Du, Zhaoyang, Guan, Yijin, Guan, Tianchan, Niu, Dimin, Tan, Nianxiong, Yu, Xiaopeng, Zheng, Hongzhong, Meng, Jianyi, Yan, Xiaolang, Xie, Yuan

论文摘要

稀疏的一般矩阵乘法（SPGEMM）是许多科学应用中的基本基础。 SPGEMM的一项关键任务是计算或预测有效的内存分配和负载平衡的输出矩阵的结构（即，每个输出行的非零元素的数量），这会影响SPGEMM的整体性能。现有的工作要么精确地计算出输出结构，要么采用基于上限或采样的方法来预测输出结构。但是，这些方法要么需要太多执行时间，要么不够准确。在本文中，我们提出了一种基于采样的新方法，与现有基于采样的方法相比，具有更好的精度和低成本。提出的方法首先通过利用同一采样结果矩阵的中间产物数量（表示为flop）来预测SPGEMM的压缩比和非零元素（表示为NNZ）的数量。然后，通过将每个输出行除以预测的压缩率来获得预测的输出结构。我们还建议使用优化的计算开销的现有基于采样方法的参考设计，以证明所提出的方法的准确性。我们构建具有各种基质维度和稀疏结构的625个测试用例，以评估预测准确性。实验结果表明，在最坏的情况下，所提出方法和参考设计的绝对相对误差分别为1.56 \％和8.12 \％，平均分别为25 \％和156 \％。

Sparse general matrix multiplication (SpGEMM) is a fundamental building block in numerous scientific applications. One critical task of SpGEMM is to compute or predict the structure of the output matrix (i.e., the number of nonzero elements per output row) for efficient memory allocation and load balance, which impact the overall performance of SpGEMM. Existing work either precisely calculates the output structure or adopts upper-bound or sampling-based methods to predict the output structure. However, these methods either take much execution time or are not accurate enough. In this paper, we propose a novel sampling-based method with better accuracy and low costs compared to the existing sampling-based method. The proposed method first predicts the compression ratio of SpGEMM by leveraging the number of intermediate products (denoted as FLOP) and the number of nonzero elements (denoted as NNZ) of the same sampled result matrix. And then, the predicted output structure is obtained by dividing the FLOP per output row by the predicted compression ratio. We also propose a reference design of the existing sampling-based method with optimized computing overheads to demonstrate the better accuracy of the proposed method. We construct 625 test cases with various matrix dimensions and sparse structures to evaluate the prediction accuracy. Experimental results show that the absolute relative errors of the proposed method and the reference design are 1.56\% and 8.12\%, respectively, on average, and 25\% and 156\%, respectively, in the worst case.

下载PDF全文

下载文献需遵守相关版权规定

论文标题