Paper Title
Achieving Fairness at No Utility Cost via Data Reweighing with Influence
Paper Authors
Paper Abstract
With the fast development of algorithmic governance, fairness has become a compulsory property for machine learning models to suppress unintentional discrimination. In this paper, we focus on the pre-processing aspect of achieving fairness and propose a data reweighing approach that only adjusts the weights of samples in the training phase. Unlike most previous reweighing methods, which assign a uniform weight to each (sub)group, we granularly model the influence of each training sample on fairness-related quantities and predictive utility, and compute individual weights based on influence under constraints from both fairness and utility. Experimental results reveal that previous methods achieve fairness at a non-negligible cost in utility, whereas, as a significant advantage, our approach can empirically relax this tradeoff and obtain cost-free fairness with respect to equal opportunity. We demonstrate this cost-free fairness with vanilla classifiers and standard training processes, comparing against baseline methods on multiple real-world tabular datasets. Code is available at https://github.com/brandeis-machine-learning/influence-fairness.
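To make the idea concrete, below is a minimal, self-contained sketch of influence-based reweighing for equal opportunity. It assumes an L2-regularized logistic-regression model, a soft (score-based) equal-opportunity gap, average training loss as the utility proxy, and a simplified linear program over per-sample weight perturbations. The synthetic data and all names (`infl_fair`, `infl_util`, the LP formulation) are illustrative assumptions, not the paper's exact formulation; see the linked repository for the authors' implementation.

```python
# Illustrative sketch (not the paper's exact method): influence-based
# reweighing for a soft equal-opportunity gap under a utility objective.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Synthetic data: features X, labels y, binary sensitive attribute a.
n, d = 400, 5
X = rng.normal(size=(n, d))
a = rng.integers(0, 2, size=n)                 # sensitive group membership
logits = X @ rng.normal(size=d) + 0.8 * a      # group-dependent labels
y = (logits + rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit L2-regularized logistic regression by Newton's method.
lam = 1e-2
theta = np.zeros(d)
for _ in range(30):
    p = sigmoid(X @ theta)
    grad = X.T @ (p - y) / n + lam * theta
    H = X.T @ (X * (p * (1 - p))[:, None]) / n + lam * np.eye(d)
    theta -= np.linalg.solve(H, grad)

p = sigmoid(X @ theta)
H = X.T @ (X * (p * (1 - p))[:, None]) / n + lam * np.eye(d)
G = (p - y)[:, None] * X                       # per-sample loss gradients

# Soft equal-opportunity gap: difference in mean predicted score
# between the two groups among positive examples.
pos1 = (y == 1) & (a == 1)
pos0 = (y == 1) & (a == 0)

def mean_score_grad(mask):
    # Gradient of the mean predicted score over `mask` w.r.t. theta.
    return ((p * (1 - p))[mask][:, None] * X[mask]).mean(axis=0)

gap = p[pos1].mean() - p[pos0].mean()
grad_fair = mean_score_grad(pos1) - mean_score_grad(pos0)
grad_util = G.mean(axis=0)                     # avg-training-loss gradient

# Classic first-order influence of upweighting sample i on a quantity f:
#   I_i = -grad_f^T H^{-1} g_i
infl_fair = -G @ np.linalg.solve(H, grad_fair)
infl_util = -G @ np.linalg.solve(H, grad_util)

# Linear program over weight perturbations delta in [-1, 1]:
# minimize the predicted change in utility loss, subject to the
# first-order estimate of the fairness gap being driven to zero.
# (A practical version would also bound ||delta|| so the linear
# influence approximation stays trustworthy.)
res = linprog(
    c=infl_util,                               # predicted utility change
    A_eq=(infl_fair / n)[None, :],             # weight change of delta_i/n
    b_eq=[-gap],                               # cancel the current gap
    bounds=[(-1.0, 1.0)] * n,
    method="highs",
)
if res.success:
    weights = 1.0 + res.x                      # new sample weights, >= 0
    print(f"gap before: {gap:+.4f}  "
          f"first-order gap after: {gap + infl_fair @ res.x / n:+.4f}")
else:
    print("LP infeasible; widen the bounds or relax the fairness target.")
```

The equality constraint cancels the first-order estimate of the gap, while the objective picks, among all such perturbations, the one predicted to cost the least utility; the resulting `weights` would then be used to retrain the classifier.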