Paper Title
Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages
Paper Authors
Paper Abstract
In order for NLP technology to be widely applicable, fair, and useful, it needs to serve a diverse set of speakers across the world's languages, be equitable, i.e., not unduly biased towards any particular language, and be inclusive of all users, particularly in low-resource settings where compute constraints are common. In this paper, we propose an evaluation paradigm that assesses NLP technologies across all three dimensions. While diversity and inclusion have received attention in recent literature, equity is currently unexplored. We propose to address this gap using the Gini coefficient, a well-established metric used for estimating societal wealth inequality. Using our paradigm, we highlight the distressed state of current technologies for Indian (IN) languages (a linguistically large and diverse set, with a varied speaker population), across all three dimensions. To improve upon these metrics, we demonstrate the importance of region-specific choices in model building and dataset creation, and more importantly, propose a novel, generalisable approach to optimal resource allocation during fine-tuning. Finally, we discuss steps to mitigate these biases and encourage the community to employ multi-faceted evaluation when building linguistically diverse and equitable technologies.
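As a rough illustration of the equity metric proposed in the abstract, below is a minimal sketch of computing the Gini coefficient over per-language performance scores. The function name, variable names, and example numbers are illustrative assumptions, not the paper's implementation or data.

```python
# Minimal sketch: Gini coefficient over per-language performance scores.
# All names and numbers below are hypothetical illustrations.

def gini(scores):
    """Gini coefficient of a list of non-negative scores.

    Returns 0.0 for perfect equality (all languages perform the same)
    and approaches 1.0 as performance concentrates in a few languages.
    """
    xs = sorted(scores)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formulation with scores sorted ascending and i = 1..n:
    # G = (2 * sum_i(i * x_i)) / (n * sum_i(x_i)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical per-language accuracies for an Indian-language benchmark.
    per_language_accuracy = [0.91, 0.85, 0.62, 0.40, 0.33]
    print(f"Gini coefficient: {gini(per_language_accuracy):.3f}")
```

Unlike a simple average, this measure penalizes uneven performance across languages, which is why the abstract proposes it as a complement to diversity and inclusion metrics.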