Paper Title
Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics
Authors
Abstract
Why do biased predictions arise? What interventions can prevent them? We evaluate 8.2 million algorithmic predictions of math performance from $\approx$400 AI engineers, each of whom developed an algorithm under a randomly assigned experimental condition. Our treatment arms modified programmers' incentives, training data, awareness, and/or technical knowledge of AI ethics. We then assess out-of-sample predictions from their algorithms using randomized audit manipulations of algorithm inputs and ground-truth math performance for 20K subjects. We find that biased predictions are mostly caused by biased training data. However, one-third of the benefit of better training data comes through a novel economic mechanism: Engineers exert greater effort and are more responsive to incentives when given better training data. We also assess how performance varies with programmers' demographic characteristics, and their performance on a psychological test of implicit bias (IAT) concerning gender and careers. We find no evidence that female, minority, and low-IAT engineers exhibit lower bias or discrimination in their code. However, we do find that prediction errors are correlated within demographic groups, which creates performance improvements through cross-demographic averaging. Finally, we quantify the benefits and tradeoffs of practical managerial or policy interventions such as technical advice, simple reminders, and improved incentives for decreasing algorithmic bias.
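The gain from cross-demographic averaging follows from standard variance arithmetic: when prediction errors are strongly correlated within a demographic group but more weakly correlated across groups, an ensemble that averages predictions across groups has lower error variance. A minimal sketch of this reasoning, under an equal-error-variance assumption and notation of our own (not the paper's):

% Average of n predictors' errors e_i, each with variance \sigma^2 and
% average pairwise correlation \bar{\rho}:
\[
  \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} e_i\right)
  = \frac{\sigma^2}{n} + \frac{n-1}{n}\,\bar{\rho}\,\sigma^2 .
\]
% Drawing the n predictors from different demographic groups lowers
% \bar{\rho} (since cross-group error correlations are smaller than
% within-group ones), and therefore lowers the ensemble's error variance.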