Paper title
Exploiting dynamic sparse matrices for performance portable linear algebra operations
Paper authors
Paper abstract
Sparse matrices and linear algebra are at the heart of scientific simulations. More than 70 sparse matrix storage formats have been developed over the years, targeting a wide range of hardware architectures and matrix types. Each format is designed to exploit the particular strengths of an architecture or the specific sparsity patterns of matrices, and choosing the right format can be crucial for achieving optimal performance. Dynamic sparse matrices, which can change their underlying data structure at runtime to match the computation without introducing prohibitive overheads, have the potential to optimize performance through dynamic format selection. In this paper, we introduce Morpheus, a library that provides an efficient abstraction for dynamic sparse matrices. Dynamic matrices aim to improve the productivity of developers and end users who do not need to know the implementation specifics of the different formats available but still want to exploit this optimization opportunity to improve the performance of their applications. We demonstrate that by porting HPCG to use Morpheus, and without further code changes, 1) HPCG can now target heterogeneous environments and 2) the performance of the SpMV kernel improves by up to 2.5x and 7x on CPUs and GPUs respectively, through runtime selection of the best format on each MPI process.
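To make the idea of a dynamic sparse matrix concrete, the following is a minimal sketch in pure Python. It is not the Morpheus API: the names `CooMatrix`, `CsrMatrix`, `DynamicMatrix`, `activate`, and `select_fastest_format` are hypothetical and chosen only for illustration. Selecting the format by timing a few SpMV runs is likewise an assumption, since the abstract states only that the best format is selected at runtime.

```python
import time


class CooMatrix:
    """Coordinate format: parallel lists of row indices, column indices, values."""

    def __init__(self, n_rows, n_cols, rows, cols, vals):
        self.shape = (n_rows, n_cols)
        self.rows, self.cols, self.vals = list(rows), list(cols), list(vals)

    @classmethod
    def from_coo(cls, coo):
        return coo

    def to_coo(self):
        return self

    def spmv(self, x):
        # y = A @ x, accumulating one stored entry at a time.
        y = [0.0] * self.shape[0]
        for r, c, v in zip(self.rows, self.cols, self.vals):
            y[r] += v * x[c]
        return y


class CsrMatrix:
    """Compressed sparse row: entries of row i live in row_ptr[i]:row_ptr[i + 1]."""

    def __init__(self, n_rows, n_cols, row_ptr, cols, vals):
        self.shape = (n_rows, n_cols)
        self.row_ptr, self.cols, self.vals = list(row_ptr), list(cols), list(vals)

    @classmethod
    def from_coo(cls, coo):
        n_rows, n_cols = coo.shape
        # Sort entries by (row, column), then count the entries in each row.
        order = sorted(range(len(coo.vals)), key=lambda k: (coo.rows[k], coo.cols[k]))
        row_ptr = [0] * (n_rows + 1)
        for k in order:
            row_ptr[coo.rows[k] + 1] += 1
        for i in range(n_rows):
            row_ptr[i + 1] += row_ptr[i]
        return cls(n_rows, n_cols, row_ptr,
                   [coo.cols[k] for k in order], [coo.vals[k] for k in order])

    def to_coo(self):
        rows = []
        for i in range(self.shape[0]):
            rows.extend([i] * (self.row_ptr[i + 1] - self.row_ptr[i]))
        return CooMatrix(self.shape[0], self.shape[1], rows, self.cols, self.vals)

    def spmv(self, x):
        y = []
        for i in range(self.shape[0]):
            acc = 0.0
            for k in range(self.row_ptr[i], self.row_ptr[i + 1]):
                acc += self.vals[k] * x[self.cols[k]]
            y.append(acc)
        return y


# Hypothetical registry of the formats the dynamic matrix can switch between.
FORMATS = {"coo": CooMatrix, "csr": CsrMatrix}


class DynamicMatrix:
    """Holds exactly one concrete format at a time and can switch at runtime."""

    def __init__(self, matrix):
        self._active = matrix

    @property
    def format(self):
        return next(name for name, cls in FORMATS.items()
                    if isinstance(self._active, cls))

    def activate(self, name):
        # Convert through COO so each format only needs two conversion routines.
        self._active = FORMATS[name].from_coo(self._active.to_coo())

    def spmv(self, x):
        # Callers never need to know which format is currently active.
        return self._active.spmv(x)


def select_fastest_format(matrix, x, repeats=3):
    """Time SpMV in every registered format and leave the fastest one active."""
    timings = {}
    for name in FORMATS:
        matrix.activate(name)
        start = time.perf_counter()
        for _ in range(repeats):
            matrix.spmv(x)
        timings[name] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    matrix.activate(best)
    return best


a = DynamicMatrix(CooMatrix(3, 3, [0, 1, 2, 2], [0, 1, 0, 2], [2.0, 3.0, 1.0, 4.0]))
x = [1.0, 2.0, 3.0]
y_coo = a.spmv(x)
a.activate("csr")
y_csr = a.spmv(x)
best = select_fastest_format(a, x)
```

The calling code uses only `a.spmv(x)`, so switching formats requires no changes at the call site. This is the property that, according to the abstract, allows HPCG to be ported without further code changes. In a distributed run, each process could call a selection routine like this on its local matrix and so end up with a different format per process.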