Paper Title

A portable coding strategy to exploit vectorization on combustion simulations

Authors

Fabio Banchelli, Guillermo Oyarzun, Marta Garcia-Gasulla, Filippo Mantovani, Ambrus Both, Guillaume Houzeaux, Daniel Mira

Abstract

The complexity of combustion simulations demands the latest high-performance computing (HPC) tools to reduce their time to solution. A current trend in HPC systems is the use of CPUs with SIMD or vector extensions to exploit data parallelism. Our work proposes a strategy to improve the automatic vectorization of finite-element-based scientific codes. The approach applies a parametric configuration to the data structures to help the compiler detect the blocks of code that can take advantage of vector computation, while keeping the code portable. The computational impact of this methodology on the different stages of a CFD solver is analyzed in detail on the PRECCINSTA burner simulation. Our parametric implementation helps the compiler generate more vector instructions in the assembly operation: this reduces the total number of executed instructions by up to 9.3 times, while the Instructions Per Cycle and the CPU frequency remain constant. The proposed strategy improves the performance of the CFD case under study by up to 4.67 times on the MareNostrum 4 supercomputer.
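The abstract does not show the authors' implementation, so the following is only a minimal C++ sketch of the general idea of a parametric data layout that eases auto-vectorization. All names here (`VECTOR_SIZE`, `ElementPack`, `integrate_pack`, `kNodes`) are hypothetical and chosen for illustration. The assumption is that elements are processed in packs whose size is a compile-time parameter, with the element index stored contiguously so that the innermost loop has a fixed trip count and unit stride.

```cpp
#include <array>
#include <cstddef>

// Hypothetical compile-time parameter: number of elements processed together.
// A value of 1 gives a scalar code path; larger values expose an inner loop
// with a fixed trip count that a compiler may auto-vectorize.
#ifndef VECTOR_SIZE
#define VECTOR_SIZE 8
#endif

// Nodes per element (assumed here to be 4, e.g. a tetrahedron).
constexpr std::size_t kNodes = 4;

// A pack of VECTOR_SIZE elements. The element index is the fastest-varying
// (contiguous) one, so consecutive elements sit next to each other in memory.
struct ElementPack {
    std::array<std::array<double, VECTOR_SIZE>, kNodes> nodal_value;
    std::array<std::array<double, VECTOR_SIZE>, kNodes> shape_weight;
};

// For each element in the pack, accumulates the weighted sum over its nodes.
std::array<double, VECTOR_SIZE> integrate_pack(const ElementPack& pack) {
    std::array<double, VECTOR_SIZE> result{};  // zero-initialized
    for (std::size_t node = 0; node < kNodes; ++node) {
        // Innermost loop over elements: unit stride, no cross-iteration
        // dependency, trip count known at compile time.
        for (std::size_t e = 0; e < VECTOR_SIZE; ++e) {
            result[e] += pack.shape_weight[node][e] * pack.nodal_value[node][e];
        }
    }
    return result;
}
```

In a sketch like this, portability comes from changing only the parameter (for example, compiling with `-DVECTOR_SIZE=16`) rather than writing architecture-specific intrinsics. Whether the loop is actually vectorized depends on the compiler and its flags.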
