Paper Title

Selective Network Linearization for Efficient Private Inference

Paper Authors

Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

Paper Abstract

Private inference (PI) enables inference directly on cryptographically secure data. While promising to address many privacy issues, it has seen limited use due to extreme runtimes. Unlike plaintext inference, where latency is dominated by FLOPs, in PI non-linear functions (namely ReLU) are the bottleneck. Thus, practical PI demands novel ReLU-aware optimizations. To reduce PI latency, we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy. We evaluate our algorithm on several standard PI benchmarks. The results demonstrate up to $4.25\%$ more accuracy (iso-ReLU count at 50K) or $2.2\times$ less latency (iso-accuracy at 70\%) than the current state of the art and advance the Pareto frontier across the latency-accuracy space. To complement empirical results, we present a "no free lunch" theorem that sheds light on how and when network linearization is possible while maintaining prediction accuracy. Public code is available at \url{https://github.com/NYU-DICE-Lab/selective_network_linearization}.
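
The abstract describes a gradient-based scheme that learns which ReLUs to keep and which to replace with the identity (i.e., linearize), since each ReLU is expensive under private inference. The linked repository contains the authors' implementation; the snippet below is only a minimal, hypothetical PyTorch sketch of the general idea: a ReLU gated by a trainable per-position mask, plus an L1-style penalty that pushes the ReLU count down. The names `SelectiveReLU` and `relu_budget_penalty`, the blending rule, and the penalty weight are illustrative assumptions, not the paper's exact algorithm.

```python
import torch
import torch.nn as nn


class SelectiveReLU(nn.Module):
    """Hypothetical sketch: a ReLU whose activations can be selectively
    linearized (replaced by the identity) via a trainable mask.

    Each activation position has a mask logit; the layer computes
    m * ReLU(x) + (1 - m) * x with m = sigmoid(logit), so positions
    driven toward m = 0 become linear and would incur no ReLU cost
    under private inference.
    """

    def __init__(self, shape):
        super().__init__()
        # One trainable mask logit per activation position (assumption).
        self.mask_logits = nn.Parameter(torch.zeros(shape))

    def forward(self, x):
        m = torch.sigmoid(self.mask_logits)       # soft mask in (0, 1)
        return m * torch.relu(x) + (1.0 - m) * x  # blend ReLU and identity


def relu_budget_penalty(model, weight=1e-4):
    """L1-style penalty encouraging mask values toward zero (fewer ReLUs)."""
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, SelectiveReLU):
            penalty = penalty + torch.sigmoid(module.mask_logits).sum()
    return weight * penalty


# Example usage: drop-in replacement for nn.ReLU in a small conv block.
block = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    SelectiveReLU(shape=(16, 32, 32)),  # per-pixel mask for 32x32 inputs
)
x = torch.randn(8, 3, 32, 32)
out = block(x)  # shape: (8, 16, 32, 32)
```

In this sketch the mask is trained jointly with the network weights and the penalty is added to the task loss; positions whose mask falls below a threshold would then be frozen as linear. This is a rough illustration of selective linearization under stated assumptions, not the authors' method.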
