Paper Title
NIPQ: Noise proxy-based Integrated Pseudo-Quantization
Paper Authors
Paper Abstract
Straight-through estimator (STE), which enables gradient flow through a non-differentiable function via approximation, has been favored in studies related to quantization-aware training (QAT). However, STE incurs unstable convergence during QAT, resulting in notable quality degradation at low precision. Recently, pseudo-quantization training has been proposed as an alternative approach that updates the learnable parameters using pseudo-quantization noise instead of STE. In this study, we propose a novel noise proxy-based integrated pseudo-quantization (NIPQ) that enables unified support of pseudo-quantization for both activation and weight by integrating the idea of truncation into the pseudo-quantization framework. NIPQ updates all of the quantization parameters (e.g., bit-width and truncation boundary) as well as the network parameters via gradient descent without STE instability. According to our extensive experiments, NIPQ outperforms existing quantization algorithms in various vision and language applications by a large margin.
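The core idea the abstract describes can be sketched as follows: instead of rounding (which requires STE for gradients), the forward pass adds uniform noise whose magnitude matches the rounding error of a uniform quantizer, so that gradients flow to the weight and to the learnable truncation boundaries without any non-differentiable step. This is a minimal illustrative sketch under stated assumptions, not the authors' implementation; the function names, the 4-bit setting, and the symmetric range are all hypothetical choices for the example.

```python
import random

def step_size(alpha, beta, bits):
    # Step of a uniform quantizer on the truncation range [alpha, beta].
    return (beta - alpha) / (2 ** bits - 1)

def pseudo_quantize(w, alpha, beta, bits, rng):
    # Truncate (clamp) the value to the learnable range [alpha, beta];
    # alpha and beta would receive gradients in a real training setup.
    w_c = min(max(w, alpha), beta)
    # Inject uniform noise bounded like rounding error, in place of the
    # non-differentiable round() that would otherwise require STE.
    delta = step_size(alpha, beta, bits)
    noise = rng.uniform(-0.5, 0.5) * delta
    return w_c + noise

rng = random.Random(0)
alpha, beta, bits = -1.0, 1.0, 4  # hypothetical 4-bit symmetric range
w = 0.3
w_hat = pseudo_quantize(w, alpha, beta, bits, rng)
# The perturbation is bounded by half a quantization step.
assert abs(w_hat - w) <= step_size(alpha, beta, bits) / 2
```

Because the forward pass is an identity plus bounded noise, the backward pass is the ordinary gradient; no surrogate gradient is needed, which is what allows bit-width and truncation boundaries to be optimized jointly with the weights.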