Paper title
Productivity meets Performance: Julia on A64FX
Authors
Abstract
The Fujitsu A64FX ARM-based processor is used in supercomputers such as Fugaku in Japan and Isambard 2 in the UK. It provides an interesting combination of hardware features, such as the Scalable Vector Extension (SVE) and native support for reduced-precision floating-point arithmetic. The goal of this paper is to explore the performance of the Julia programming language on the A64FX processor, with a particular focus on reduced precision. We present a performance study of axpy to verify the compilation pipeline, demonstrating that Julia can match the performance of tuned libraries. Additionally, we analyze Message Passing Interface (MPI) scalability and throughput on Fugaku, showing that Julia's MPI interface adds next to no overhead. To explore the usability of Julia for targeting various floating-point precisions, we present results from ShallowWaters.jl, a shallow water model that can be executed at various levels of precision. Even for such complex applications, Julia's type-flexible programming paradigm offers both productivity and performance.