Paper Title

CUBE -- Towards an Optimal Scaling of Cosmological N-body Simulations

Paper Authors

Cheng, Shenggan, Yu, Hao-Ran, Inman, Derek, Liao, Qiucheng, Wu, Qiaoya, Lin, James

Paper Abstract

N-body simulations are essential tools in physical cosmology for understanding the large-scale structure (LSS) formation of the Universe. Large-scale simulations with high resolution are important for exploring the substructure of the universe and for determining fundamental physical parameters such as the neutrino mass. However, traditional particle-mesh (PM) based algorithms use considerable amounts of memory, which limits the scalability of simulations. We therefore designed CUBE, a two-level PM algorithm aimed at optimal memory consumption. By using a fixed-point compression technique, CUBE reduces the memory consumption per N-body particle to 6 bytes, an order of magnitude lower than traditional PM-based algorithms. We scaled CUBE to 512 nodes (20,480 cores) on an Intel Cascade Lake based supercomputer with $\simeq$95\% weak-scaling efficiency. This scaling test was performed in "Cosmo-$\pi$" -- a cosmological LSS simulation using $\simeq$4.4 trillion particles, tracing the evolution of the universe over $\simeq$13.7 billion years. To the best of our knowledge, Cosmo-$\pi$ is the largest completed cosmological N-body simulation. We believe CUBE has great potential to scale on exascale supercomputers for even larger simulations.
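The fixed-point compression idea mentioned in the abstract can be sketched as follows: each particle's position is stored as a quantized offset within its coarse mesh cell, so one byte per dimension suffices (with velocities quantized analogously, this approaches the stated 6 bytes per particle). This is a minimal illustration under those assumptions, not CUBE's actual implementation; the function names are hypothetical, and CUBE additionally avoids storing explicit cell indices by keeping particles in cell order.

```python
import numpy as np

def compress_positions(pos, cell_size):
    """Quantize each particle's offset within its coarse mesh cell
    to an 8-bit integer (1 byte per dimension)."""
    cell_index = np.floor(pos / cell_size).astype(np.int64)
    offset = pos - cell_index * cell_size              # offset in [0, cell_size)
    quantized = np.floor(offset / cell_size * 256).astype(np.uint8)
    return cell_index, quantized

def decompress_positions(cell_index, quantized, cell_size):
    """Reconstruct positions at the center of each quantization bin."""
    offset = (quantized.astype(np.float64) + 0.5) / 256 * cell_size
    return cell_index * cell_size + offset

# Example: three coordinates of one particle, cells of unit size.
pos = np.array([[1.23, 4.56, 7.89]])
ci, q = compress_positions(pos, cell_size=1.0)
rec = decompress_positions(ci, q, cell_size=1.0)
# The quantization error is at most half a bin, i.e. cell_size / 512.
assert np.all(np.abs(rec - pos) <= 1.0 / 512)
```

The design trade-off is resolution versus memory: the quantization error is bounded by half the bin width (cell_size / 512 here), which is acceptable because sub-cell force resolution is handled by the fine-level PM grid in a two-level scheme.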
