Paper Title


RepFair-GAN: Mitigating Representation Bias in GANs Using Gradient Clipping

Paper Authors

Patrik Joslin Kenfack, Kamil Sabbagh, Adín Ramírez Rivera, Adil Khan

Paper Abstract


Fairness has become an essential problem in many domains of Machine Learning (ML), such as classification, natural language processing, and Generative Adversarial Networks (GANs). In this research effort, we study the unfairness of GANs. We formally define a new fairness notion for generative models in terms of the distribution of generated samples sharing the same protected attributes (gender, race, etc.). The defined fairness notion (representational fairness) requires the distribution of the sensitive attributes at test time to be uniform. For GAN models in particular, we show that this notion is violated even when the dataset contains equally represented groups: the generator favors generating samples of one group over the others at test time. In this work, we shed light on the source of this representation bias in GANs and present a straightforward method to overcome it. We first show on two widely used datasets (MNIST, SVHN) that when the gradient norm of one group is larger than that of the other during the discriminator's training, the generator favors sampling data from that group at test time. We then show that controlling the groups' gradient norms by performing group-wise gradient norm clipping in the discriminator during training leads to fairer data generation, in terms of representational fairness, than existing models, while preserving the quality of the generated samples.
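The mechanism described in the abstract — equalizing per-group gradient norms in the discriminator before they are combined into an update — can be sketched as follows. This is a minimal NumPy illustration under assumed simplifications (per-group gradients already computed as flat vectors; aggregation by a simple mean), not the authors' implementation; the function name and the `max_norm` parameter are hypothetical.

```python
import numpy as np

def clip_group_gradients(group_grads, max_norm):
    """Group-wise gradient norm clipping (sketch).

    Each group's gradient is rescaled so its L2 norm does not exceed
    max_norm, preventing one group from dominating the aggregated
    discriminator update. group_grads is a list of flat gradient
    vectors, one per protected group.
    """
    clipped = []
    for g in group_grads:
        norm = np.linalg.norm(g)
        if norm > max_norm:
            # rescale to the clipping threshold, preserving direction
            g = g * (max_norm / norm)
        clipped.append(g)
    # aggregate the clipped per-group gradients into one update direction
    return np.mean(clipped, axis=0)

# Illustration: group A's gradient is 10x larger than group B's.
# After clipping (max_norm=1.0), A is scaled down to norm 1 while B
# passes through unchanged, so A no longer dominates the mean update.
grad_a = np.array([3.0, 4.0])   # norm 5.0 -> clipped to [0.6, 0.8]
grad_b = np.array([0.3, 0.4])   # norm 0.5 -> unchanged
update = clip_group_gradients([grad_a, grad_b], max_norm=1.0)
```

In a real GAN training loop this would be applied to the discriminator's parameter gradients computed separately per group in each batch, before the optimizer step; the mean aggregation here stands in for whatever combination rule the training procedure uses.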
