Paper Title
Productive Performance Engineering for Weather and Climate Modeling with Python
Paper Authors
Paper Abstract
Earth system models are developed with a tight coupling to target hardware, often containing specialized code predicated on processor characteristics. This coupling stems from using imperative languages that hard-code computation schedules and layout. We present a detailed account of optimizing the Finite Volume Cubed-Sphere Dynamical Core (FV3), improving productivity and performance. By using a declarative Python-embedded stencil domain-specific language and data-centric optimization, we abstract hardware-specific details and define a semi-automated workflow for analyzing and optimizing weather and climate applications. The workflow utilizes both local and full-program optimization, as well as user-guided fine-tuning. To prune the infeasible global optimization space, we automatically utilize repeating code motifs via a novel transfer tuning approach. On the Piz Daint supercomputer, we scale to 2,400 GPUs, achieving speedups of up to 3.92x over the tuned production implementation at a fraction of the original code.
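To make the contrast between hard-coded schedules and a declarative stencil description concrete, the following is a minimal pure-Python sketch. It is not the DSL used in the paper. The names `LAPLACIAN`, `apply_stencil`, and the `order` parameter are hypothetical and chosen only for illustration. The stencil is stated as data (offsets and weights), while a separate routine decides the loop order, standing in for a schedule that an optimizer could change without touching the stencil definition.

```python
# Declarative description of a five-point Laplacian: offset (di, dj) -> weight.
# Nothing in this definition fixes a loop order or a memory layout.
LAPLACIAN = {(0, 0): -4.0, (-1, 0): 1.0, (1, 0): 1.0, (0, -1): 1.0, (0, 1): 1.0}


def apply_stencil(stencil, field, order="ij"):
    """Hypothetical backend: apply `stencil` to the interior of a 2D list of lists.

    `order` selects the loop nest ("ij" or "ji"). It stands in for a schedule
    chosen separately from the stencil definition (an assumption of this sketch).
    """
    ni, nj = len(field), len(field[0])
    # Halo width: the largest offset the stencil reaches in any direction.
    halo = max(max(abs(di), abs(dj)) for di, dj in stencil)
    out = [[0.0] * nj for _ in range(ni)]
    interior_i = range(halo, ni - halo)
    interior_j = range(halo, nj - halo)
    if order == "ij":
        points = ((i, j) for i in interior_i for j in interior_j)
    elif order == "ji":
        points = ((i, j) for j in interior_j for i in interior_i)
    else:
        raise ValueError(f"unknown loop order: {order!r}")
    for i, j in points:
        out[i][j] = sum(w * field[i + di][j + dj] for (di, dj), w in stencil.items())
    return out


# Test field f(i, j) = i**2 + j, whose discrete Laplacian is 2 at every interior point.
field = [[float(i * i + j) for j in range(5)] for i in range(5)]
result_ij = apply_stencil(LAPLACIAN, field, order="ij")
result_ji = apply_stencil(LAPLACIAN, field, order="ji")
```

Both loop orders produce the same result because the stencil definition carries no schedule. In an imperative implementation, the loop nest and the arithmetic would be written together, so changing the schedule would mean rewriting the computation.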