Paper Title
Bayesian Crowdsourcing with Constraints
Paper Authors
Paper Abstract
Crowdsourcing has emerged as a powerful paradigm for efficiently labeling large datasets and performing various learning tasks by leveraging crowds of human annotators. When additional information about the data is available, semi-supervised crowdsourcing approaches that enhance the aggregation of labels from human annotators are well motivated. This work deals with semi-supervised crowdsourced classification under two regimes of semi-supervision: a) label constraints, which provide ground-truth labels for a subset of the data; and b) potentially easier-to-obtain instance-level constraints, which indicate relationships between pairs of data. Bayesian algorithms based on variational inference are developed for each regime, and their quantifiably improved performance, compared to unsupervised crowdsourcing, is analytically and empirically validated on several crowdsourcing datasets.
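As a rough illustration of regime (a) only, the sketch below shows how a handful of ground-truth labels can change the aggregated result. It is not the variational Bayesian algorithm of the paper: it swaps in a much simpler log-odds weighted vote, in which each annotator's accuracy is estimated from the constrained items alone. The function name `aggregate_with_label_constraints`, the toy `votes` matrix, and the Laplace smoothing are assumptions made for this example, and the binary setting is chosen because the log-odds weight is only an approximation for more than two classes. Instance-level constraints (regime b) are not covered here.

```python
import numpy as np

def aggregate_with_label_constraints(votes, known, n_classes):
    """Hypothetical helper: log-odds weighted vote guided by label constraints.

    votes: (n_items, n_annotators) int array, -1 marks a missing response.
    known: dict mapping item index -> ground-truth label (the label constraints).
    """
    n_items, n_annot = votes.shape

    # Laplace-smoothed accuracy of each annotator, measured on constrained items only.
    correct = np.ones(n_annot)
    total = np.full(n_annot, 2.0)
    for i, y in known.items():
        seen = votes[i] >= 0
        correct += seen & (votes[i] == y)
        total += seen
    acc = correct / total  # strictly inside (0, 1) thanks to the smoothing

    # Reliable annotators get positive weight, worse-than-chance ones negative.
    weights = np.log(acc / (1.0 - acc))

    scores = np.zeros((n_items, n_classes))
    for j in range(n_annot):
        seen = np.flatnonzero(votes[:, j] >= 0)
        scores[seen, votes[seen, j]] += weights[j]

    labels = scores.argmax(axis=1)
    # Constrained items keep their ground-truth labels.
    for i, y in known.items():
        labels[i] = y
    return labels, acc

# Toy data (assumed): 5 items, 3 annotators, binary labels.
votes = np.array([
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
])
known = {0: 0, 1: 1}  # ground truth supplied for items 0 and 1

labels, acc = aggregate_with_label_constraints(votes, known, n_classes=2)
print(labels.tolist())  # [0, 1, 0, 1, 1]
print(acc.tolist())     # [0.75, 0.75, 0.25]
```

In this toy case, plain majority voting would label items 2 and 3 as 1 and 0, respectively. The two constrained items reveal that the third annotator tends to be wrong, so that annotator's vote is counted against the class it names and both labels flip.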