Paper Title
Characterising Bias in Compressed Models
Paper Authors
Paper Abstract
The popularity and widespread use of pruning and quantization are driven by the severe resource constraints of deploying deep neural networks to environments with strict latency, memory, and energy requirements. These techniques achieve high levels of compression with negligible impact on top-line metrics (top-1 and top-5 accuracy). However, overall accuracy hides disproportionately high errors on a small subset of examples; we call this subset Compression Identified Exemplars (CIE). We further establish that for CIE examples, compression amplifies existing algorithmic bias. Pruning disproportionately impacts performance on underrepresented features, which often coincides with considerations of fairness. Given that CIE is a relatively small subset yet a significant contributor to overall model error, we propose its use as a human-in-the-loop auditing tool to surface a tractable subset of the dataset for further inspection or annotation by a domain expert. We provide qualitative and quantitative support that CIE surfaces the most challenging examples in the data distribution for human-in-the-loop auditing.
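As a minimal sketch of how such a subset might be surfaced in practice, the snippet below compares the predictions of a dense baseline model against its compressed counterpart and collects the examples on which they disagree. This assumes two trained PyTorch classifiers and a labeled dataloader; the function name `find_cie` and the prediction-disagreement criterion are illustrative assumptions, not the paper's exact protocol.

```python
# Illustrative sketch (assumption, not the paper's exact method): flag
# examples where a compressed model's prediction diverges from the dense
# baseline's prediction, as candidate Compression Identified Exemplars.
import torch

@torch.no_grad()
def find_cie(dense_model, compressed_model, dataloader, device="cpu"):
    """Return dataset indices where the compressed model disagrees
    with the dense baseline model."""
    dense_model.eval().to(device)
    compressed_model.eval().to(device)
    cie_indices = []
    offset = 0  # running index into the (unshuffled) dataset
    for images, _ in dataloader:
        images = images.to(device)
        dense_pred = dense_model(images).argmax(dim=1)
        comp_pred = compressed_model(images).argmax(dim=1)
        # Positions within this batch where predictions diverge
        disagree = (dense_pred != comp_pred).nonzero(as_tuple=True)[0]
        cie_indices.extend((disagree + offset).tolist())
        offset += images.size(0)
    return cie_indices
```

The resulting index list is small relative to the full dataset, which is what makes it tractable to hand to a domain expert for inspection or re-annotation, as the abstract proposes.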