Paper title

High performance SIMD modular arithmetic for polynomial evaluation

Authors

Pierre Fortin, Ambroise Fleury, François Lemaire, Michael Monagan

Abstract

Two essential problems in Computer Algebra, namely polynomial factorization and polynomial greatest common divisor computation, can be efficiently solved thanks to multiple polynomial evaluations in two variables using modular arithmetic. In this article, we focus on the efficient computation of such polynomial evaluations on a single CPU core. We first show how to leverage SIMD computing for modular arithmetic on AVX2 and AVX-512 units, using both intrinsics and OpenMP compiler directives. We then increase the operational intensity and exploit instruction-level parallelism to improve the compute efficiency of these polynomial evaluations. Together, these yield performance gains of up to about 5x on AVX2 and 10x on AVX-512.
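To make the idea of SIMD modular arithmetic through OpenMP directives concrete, here is a minimal sketch in C. It is not the authors' implementation: it evaluates a univariate polynomial (not the bivariate case the abstract targets) at many points modulo a prime p using Horner's rule, and marks the loop over points with `#pragma omp simd`. The names `mulmod_u32`, `addmod_u32`, and `eval_many` are hypothetical, and the sketch assumes p < 2^31 and that all coefficients and points are already reduced modulo p. Whether a compiler actually vectorizes the 64-bit remainder depends on the compiler and flags (for example `-fopenmp-simd`).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: (a * b) mod p for a, b < p, using a 64-bit product. */
static inline uint32_t mulmod_u32(uint32_t a, uint32_t b, uint32_t p) {
    return (uint32_t)(((uint64_t)a * b) % p);
}

/* Hypothetical helper: (a + b) mod p for a, b < p < 2^31, so a + b cannot wrap. */
static inline uint32_t addmod_u32(uint32_t a, uint32_t b, uint32_t p) {
    uint32_t s = a + b;
    return s >= p ? s - p : s;
}

/* Evaluate f(x) = c[0] + c[1]*x + ... + c[deg]*x^deg at n points modulo p.
 * Assumes c[k] < p and x[i] < p. Results are written to y[0..n-1]. */
void eval_many(const uint32_t *c, size_t deg,
               const uint32_t *x, uint32_t *y, size_t n, uint32_t p) {
    /* Horner's rule starts from the leading coefficient. */
    for (size_t i = 0; i < n; i++) {
        y[i] = c[deg];
    }
    /* One Horner step per coefficient; the inner loop runs over
     * independent points and is the candidate for SIMD execution. */
    for (size_t k = deg; k-- > 0; ) {
        #pragma omp simd
        for (size_t i = 0; i < n; i++) {
            y[i] = addmod_u32(mulmod_u32(y[i], x[i], p), c[k], p);
        }
    }
}
```

This layout rereads `x` and `y` from memory on every Horner step, so it performs few arithmetic operations per byte loaded. Keeping several points in registers across multiple steps is one way to raise that ratio, in the spirit of the operational-intensity improvement the abstract mentions.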
