Paper title
A Hybrid MPI-CUDA Approach for Nonequispaced Discrete Fourier Transformation
Authors
Abstract
Nonequispaced discrete Fourier transformation (NDFT) is widely applied across computational science and engineering. The computational efficiency and accuracy of NDFT have long been critical issues that hinder its broader use in both intensive and extensive scientific computing. In our previous work (2018, S.-C. Yang et al., Appl. Comput. Harmon. Anal. 44, 273), we proposed the CUNFFT method, which showed outstanding performance in handling NDFT at intermediate scale using CUDA (Compute Unified Device Architecture) technology. In the current work, we further improve the computational efficiency of the CUNFFT method with an MPI-CUDA hybrid parallelization (HP) scheme of NFFT, enabling a cutting-edge treatment of NDFT at super-extended scale. Within this HP-NFFT method, the spatial domain of the NDFT is decomposed into several parts according to the accumulative property of NDFT and the number of available CPU and GPU nodes. The decomposed NDFT subcells are calculated independently on different CPU nodes using MPI process-level parallelization, and on different GPU nodes using CUDA thread-level parallelization and the CUNFFT algorithm. Extensive benchmarking of the HP-NFFT method indicates that it delivers a dramatic improvement in computational efficiency for handling NDFT at super-extended scale without loss of computational precision. Furthermore, the HP-NFFT method is validated by calculating the Madelung constant of the fluorite crystal structure, which verifies that the method is robust for calculating electrostatic interactions between charged ions in molecular dynamics simulation systems.
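The decomposition idea can be illustrated with a minimal sketch, which is not the authors' HP-NFFT implementation. It assumes that the "accumulative" property refers to the fact that a direct adjoint NDFT is a plain sum over nodes, so partial spectra computed on disjoint subsets of nodes can simply be added. The function names (adjoint_ndft, split, decomposed_adjoint_ndft), the 1D setting, and the sample data are all hypothetical. The loop over chunks stands in for work that MPI ranks or GPUs would do in parallel, and the running sum stands in for a reduction step.

```python
import cmath

def adjoint_ndft(nodes, values, freqs):
    """Direct adjoint NDFT: h[k] = sum_j values[j] * exp(2*pi*i*k*nodes[j])."""
    return [
        sum(v * cmath.exp(2j * cmath.pi * k * x) for x, v in zip(nodes, values))
        for k in freqs
    ]

def split(seq, parts):
    """Split seq into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(seq), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks

def decomposed_adjoint_ndft(nodes, values, freqs, parts):
    """Compute each chunk independently, then add up the partial spectra."""
    total = [0j] * len(freqs)
    for xs, vs in zip(split(nodes, parts), split(values, parts)):
        # Hypothetical stand-in: each chunk would be handled by one rank or GPU.
        partial = adjoint_ndft(xs, vs, freqs)
        # Hypothetical stand-in for a sum reduction across ranks.
        total = [t + p for t, p in zip(total, partial)]
    return total

# Made-up nonequispaced nodes in [-0.5, 0.5) with made-up sample values.
nodes = [-0.45, -0.2, -0.05, 0.1, 0.3, 0.42, 0.49]
values = [1.0, -0.5, 2.0, 0.25, -1.5, 0.75, 1.25]
freqs = range(-4, 4)

full = adjoint_ndft(nodes, values, freqs)
summed = decomposed_adjoint_ndft(nodes, values, freqs, parts=3)
max_diff = max(abs(a - b) for a, b in zip(full, summed))
```

Here max_diff measures how far the decomposed result deviates from the single-pass result. The two differ only by floating-point rounding, which is consistent with the abstract's claim that decomposition need not cost precision.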