Development and performance of a HemeLB GPU code for human-scale blood flow simulation

Paper title

Development and performance of a HemeLB GPU code for human-scale blood flow simulation

Authors

Zacharoudiou, I., McCullough, J. W. S., Coveney, P. V.

Abstract

In recent years, it has become increasingly common for high-performance computers (HPC) to possess some level of heterogeneous architecture, typically in the form of GPU accelerators. In some machines these are isolated within a dedicated partition, whilst in others they are integral to all compute nodes, often with multiple GPUs per node, and provide the majority of a machine's compute performance. In light of this trend, it is becoming essential that codes deployed on HPC are updated to execute on accelerator hardware. In this paper we introduce a GPU implementation of the 3D blood flow simulation code HemeLB that has been developed using CUDA C++. We demonstrate how taking advantage of NVIDIA GPU hardware can achieve significant performance improvements compared to the equivalent CPU-only code on which it has been built, whilst retaining the excellent strong scaling characteristics that have been repeatedly demonstrated by the CPU version. With HPC positioned on the brink of the exascale era, we use HemeLB as a motivation to provide a discussion on some of the challenges that many users will face when deploying their own applications on upcoming exascale machines.
