Paper Title
Black-Box Audits for Group Distribution Shifts
Paper Authors
Paper Abstract
When a model informs decisions about people, distribution shifts can create undue disparities. However, it is hard for external entities to check for distribution shift, as the model and its training set are often proprietary. In this paper, we introduce and study a black-box auditing method to detect cases of distribution shift that lead to a performance disparity of the model across demographic groups. By extending techniques used in membership and property inference attacks -- which are designed to expose private information from learned models -- we demonstrate that an external auditor can gain the information needed to identify these distribution shifts solely by querying the model. Our experimental results on real-world datasets show that this approach is effective, achieving 80--100% AUC-ROC in detecting shifts involving the underrepresentation of a demographic group in the training set. Researchers and investigative journalists can use our tools to perform non-collaborative audits of proprietary models and expose cases of underrepresentation in the training datasets.
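To illustrate the black-box setting the abstract describes, here is a minimal sketch of an auditor probing a model purely through queries and comparing performance across demographic groups. This is a hypothetical toy, not the paper's actual inference-attack method: the `query` function, the simulated shift, and the probe set are all invented for illustration.

```python
# Hypothetical sketch of a black-box disparity probe. The auditor only
# calls query(x); the model internals and training data stay hidden.
# This is NOT the paper's membership/property-inference technique, just
# the high-level "audit by querying" idea from the abstract.

def query(x):
    # Stand-in proprietary model: accurate on group "A", inaccurate on
    # underrepresented group "B" (simulating a distribution shift).
    feature, group = x
    return feature > 0.5 if group == "A" else feature > 0.9

def audit(probe_set):
    """Estimate per-group accuracy using black-box queries alone."""
    stats = {}
    for x, label in probe_set:
        group = x[1]
        correct, total = stats.get(group, (0, 0))
        stats[group] = (correct + (query(x) == label), total + 1)
    return {g: c / t for g, (c, t) in stats.items()}

# Labeled probe set the auditor controls; true rule is feature > 0.5.
probe = [((f / 10, g), f / 10 > 0.5)
         for f in range(10) for g in ("A", "B")]
acc = audit(probe)
# A large per-group accuracy gap flags a possible shift for group "B".
```

In this toy, group "A" scores perfectly while group "B" does not; the paper's contribution is detecting such gaps (and their training-set causes) without a labeled probe of this kind being trivially available.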