Paper title
Debiasing classifiers: is reality at variance with expectation?
Paper authors
Abstract
We present an empirical study of debiasing methods for classifiers, showing that debiasers often fail in practice to generalize out-of-sample, and can in fact make fairness worse rather than better. A rigorous evaluation of the debiasing treatment effect requires extensive cross-validation beyond what is usually done. We demonstrate that this phenomenon can be explained as a consequence of the bias-variance trade-off, with an increase in variance necessitated by imposing a fairness constraint. Follow-up experiments validate the theoretical prediction that the estimation variance depends strongly on the base rates of the protected class. Considering fairness-performance trade-offs justifies the counterintuitive notion that partial debiasing can actually yield better results in practice on out-of-sample data.
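The claim that estimation variance depends on the base rate of the protected class can be illustrated with a small simulation. The sketch below is not the paper's experiment. It assumes that "base rate" means the fraction of samples belonging to the protected class, that fairness is measured by the demographic-parity gap (the difference in positive-prediction rates between groups), and that the true gap is zero. The function name `parity_gap_std` and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics

def parity_gap_std(base_rate, n=500, pos_rate=0.3, trials=300, seed=0):
    """Standard deviation, across resampled datasets, of the estimated
    demographic-parity gap when the true gap is zero.

    base_rate: assumed fraction of samples in the protected class.
    pos_rate:  positive-prediction rate shared by both groups (assumption).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        positives = [0, 0]  # positive predictions per group
        members = [0, 0]    # sample count per group
        for _ in range(n):
            group = 1 if rng.random() < base_rate else 0
            members[group] += 1
            positives[group] += rng.random() < pos_rate
        if min(members) == 0:
            continue  # gap is undefined if a group is empty
        gaps.append(positives[1] / members[1] - positives[0] / members[0])
    return statistics.stdev(gaps)

# Compare a balanced dataset with increasingly rare protected classes.
for rate in (0.5, 0.2, 0.05):
    print(f"protected-class fraction {rate:.2f}: "
          f"std of estimated gap = {parity_gap_std(rate):.3f}")
```

Under these assumptions the spread of the estimated gap grows as the protected class becomes rarer, because the protected-group rate is estimated from fewer samples. This suggests why a single train/test split may give a misleading picture of a debiaser's effect.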