Title
Scalability of High-Performance PDE Solvers
Authors
Abstract
Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this paper, we explore performance and space-time trade-offs for important compute-intensive kernels of large-scale numerical solvers for PDEs that govern a wide range of physical applications. We consider a sequence of PDE-motivated bake-off problems designed to establish best practices for efficient high-order simulations across a variety of codes and platforms. We measure peak performance (degrees of freedom per second) on a fixed number of nodes and identify effective code optimization strategies for each architecture. In addition to peak performance, we identify the minimum time to solution at 80% parallel efficiency. The performance analysis is based on spectral and p-type finite elements but is equally applicable to a broad spectrum of numerical PDE discretizations, including finite difference, finite volume, and h-type finite elements.
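The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes strong-scaling data (a fixed problem size timed on increasing rank counts) and takes the smallest rank count as the efficiency baseline. The function names, the baseline choice, and the plain DOFs-divided-by-seconds throughput are illustrative assumptions, not the paper's exact definitions.

```python
from typing import List, Tuple

Run = Tuple[int, float]  # (ranks, wall-clock seconds) for one fixed problem size


def parallel_efficiency(runs: List[Run]) -> List[Tuple[int, float, float]]:
    """Return (ranks, seconds, efficiency) for each run, sorted by rank count.

    Assumption: efficiency is measured relative to the run with the fewest
    ranks, i.e. eff(P) = (P0 * T0) / (P * T). Raises ValueError on empty input.
    """
    base_ranks, base_time = min(runs)  # run with the smallest rank count
    return [
        (p, t, (base_ranks * base_time) / (p * t))
        for p, t in sorted(runs)
    ]


def min_time_at_efficiency(runs: List[Run], threshold: float = 0.8) -> Run:
    """Return the (ranks, seconds) pair with the smallest time among runs
    whose efficiency is at least `threshold`."""
    eligible = [
        (t, p) for p, t, eff in parallel_efficiency(runs) if eff >= threshold
    ]
    best_time, best_ranks = min(eligible)
    return best_ranks, best_time


def dofs_per_second(dofs: int, seconds: float) -> float:
    """Throughput as degrees of freedom processed per second (assumed form)."""
    return dofs / seconds


if __name__ == "__main__":
    # Made-up timings purely for illustration; not data from the paper.
    sample_runs = [(1, 100.0), (2, 52.0), (4, 28.0), (8, 17.0), (16, 12.0)]
    for ranks, seconds, eff in parallel_efficiency(sample_runs):
        print(f"ranks={ranks:3d}  time={seconds:6.1f}s  efficiency={eff:.2f}")
    print("min time at >=80% efficiency:", min_time_at_efficiency(sample_runs))
```

With these sample timings, the 8-rank and 16-rank runs are faster in absolute terms but fall below the 80% threshold, so the reported minimum time comes from the 4-rank run. That is the trade-off the efficiency-constrained metric is meant to expose.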