论文标题

加速数字理论转换,用于gpus上的Bootstrable同构加密

Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs

论文作者

Kim, Sangpyo, Jung, Wonkyung, Park, Jaiyoung, Ahn, Jung Ho

论文摘要

同态加密(HE)引起了极大的关注,因为它为加密消息提供了隐私计算的方式。数字理论转换(NTT)是整数有限场中的离散傅立叶变换(DFT)的一种专业形式,是一个关键算法,可以在HE中对加密密文进行快速计算。先前的工作通过利用DFT优化技术加速了NTT及其在流行的并行处理平台GPU上的反向转换。但是,这些基于GPU的研究缺乏对NTT和DFT之间的主要差异的全面分析,或者仅考虑小型参数,这些参数在算术操作的数量中具有严格的限制,而这些参数可以执行而无需撤销。在本文中,我们分析了NTT和DFT的算法特征,并评估NTT的性能,当我们应用通常适用于现代GPU的DFT和NTT的优化时。从分析中,我们确定NTT在大型参数集上遭受严重的主内存带宽瓶颈。为了解决主要的内存带宽问题,我们提出了一种新型的NTT特异性根生成方案,称为twiddling(OT)。与基线RADIX-2 NTT实现相比,应用了包括OT在内的所有优化后,我们在现代GPU上实现了4.2倍的速度。

Homomorphic encryption (HE) draws huge attention as it provides a way of privacy-preserving computations on encrypted messages. Number Theoretic Transform (NTT), a specialized form of Discrete Fourier Transform (DFT) in the finite field of integers, is the key algorithm that enables fast computation on encrypted ciphertexts in HE. Prior works have accelerated NTT and its inverse transformation on a popular parallel processing platform, GPU, by leveraging DFT optimization techniques. However, these GPU-based studies lack a comprehensive analysis of the primary differences between NTT and DFT or only consider small HE parameters that have tight constraints in the number of arithmetic operations that can be performed without decryption. In this paper, we analyze the algorithmic characteristics of NTT and DFT and assess the performance of NTT when we apply the optimizations that are commonly applicable to both DFT and NTT on modern GPUs. From the analysis, we identify that NTT suffers from severe main-memory bandwidth bottleneck on large HE parameter sets. To tackle the main-memory bandwidth issue, we propose a novel NTT-specific on-the-fly root generation scheme dubbed on-the-fly twiddling (OT). Compared to the baseline radix-2 NTT implementation, after applying all the optimizations, including OT, we achieve 4.2x speedup on a modern GPU.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源