Paper Title
DiPietro-Hazari Kappa: A Novel Metric for Assessing Labeling Quality via Annotation
Paper Authors
Paper Abstract
Data is a key component of modern machine learning, but statistics for assessing data label quality remain sparse in the literature. Here, we introduce DiPietro-Hazari Kappa, a novel statistical metric for assessing the quality of suggested dataset labels in the context of human annotation. Rooted in the classical Fleiss's Kappa measure of inter-annotator agreement, the DiPietro-Hazari Kappa quantifies the empirical annotator agreement differential that was attained above random chance. We offer a thorough theoretical examination of Fleiss's Kappa before turning to our derivation of DiPietro-Hazari Kappa. Finally, we conclude with a matrix formulation and a set of procedural instructions for easy computational implementation.
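To make "agreement attained above random chance" concrete, the following is a minimal sketch of the classical Fleiss's Kappa on which the abstract says the new metric is built. It is not the DiPietro-Hazari Kappa itself, whose definition the abstract does not give. The function name `fleiss_kappa` and the count-table input layout (one row per item, one column per category, each cell holding the number of annotators who chose that category) are assumptions made for illustration.

```python
def fleiss_kappa(counts):
    """Classical Fleiss's kappa for a table of annotation counts.

    counts[i][j] is the number of annotators who assigned item i to
    category j. Every item must be rated by the same number of
    annotators (at least two). This layout is an assumption of the sketch.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two annotators per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of annotations")

    n_categories = len(counts[0])
    total = n_items * n_raters

    # Share of all assignments that fell into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Per-item observed agreement: fraction of annotator pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items      # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance

    # Undefined (division by zero) when every assignment uses one category.
    return (p_bar - p_e) / (1 - p_e)


# Two items, three annotators, two categories, unanimous on each item.
print(fleiss_kappa([[3, 0], [0, 3]]))
```

The returned value is the observed agreement in excess of chance, normalized by the maximum excess attainable, so 1 indicates perfect agreement and values at or below 0 indicate agreement no better than chance.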