Paper Title

Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion

Authors

Zhengqi Gao, Fan-Keng Sun, Mingran Yang, Sucheng Ren, Zikai Xiong, Marc Engeler, Antonio Burazer, Linda Wildling, Luca Daniel, Duane S. Boning

Abstract

Data lies at the core of modern deep learning. The impressive performance of supervised learning is built upon a base of massive accurately labeled data. However, in some real-world applications, accurate labeling might not be viable; instead, multiple noisy labels (instead of one accurate label) are provided by several annotators for each data sample. Learning a classifier on such a noisy training dataset is a challenging task. Previous approaches usually assume that all data samples share the same set of parameters related to annotator errors, while we demonstrate that label error learning should be both annotator and data sample dependent. Motivated by this observation, we propose a novel learning algorithm. The proposed method displays superiority compared with several state-of-the-art baseline methods on MNIST, CIFAR-100, and ImageNet-100. Our code is available at: https://github.com/zhengqigao/Learning-from-Multiple-Annotator-Noisy-Labels.
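To make the abstract's central claim concrete — that handling label errors should depend on both the annotator and the individual data sample — here is a minimal illustrative sketch of sample-wise label fusion. It is not the authors' implementation (see their repository for that); all module names, the architecture, and the fusion rule below are assumptions for illustration. A weighting head predicts per-sample, per-annotator weights, which fuse the annotators' noisy labels into a soft target for training the classifier.

```python
# Minimal sketch (assumed, not the paper's code): a classifier plus a per-sample,
# per-annotator weighting head that fuses noisy labels into a soft training target.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10
NUM_ANNOTATORS = 3

class FusionModel(nn.Module):
    def __init__(self, in_dim=784, num_classes=NUM_CLASSES, num_annotators=NUM_ANNOTATORS):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.classifier = nn.Linear(128, num_classes)
        # Weighting head: one weight per annotator, computed from the sample's
        # features, so the fusion is both annotator- and sample-dependent.
        self.weight_head = nn.Linear(128, num_annotators)

    def forward(self, x, noisy_labels):
        # x: (B, in_dim); noisy_labels: (B, A) integer labels from A annotators
        feat = self.backbone(x)
        logits = self.classifier(feat)                        # (B, C)
        weights = F.softmax(self.weight_head(feat), dim=-1)   # (B, A)
        one_hot = F.one_hot(noisy_labels, NUM_CLASSES).float()  # (B, A, C)
        fused = torch.einsum("ba,bac->bc", weights, one_hot)    # (B, C) soft target
        return logits, fused

model = FusionModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: random inputs, each with three (possibly conflicting) noisy labels.
x = torch.randn(32, 784)
noisy = torch.randint(0, NUM_CLASSES, (32, NUM_ANNOTATORS))

logits, fused = model(x, noisy)
# Cross-entropy of the classifier's prediction against the fused soft label.
loss = torch.sum(-fused * F.log_softmax(logits, dim=-1), dim=-1).mean()
opt.zero_grad()
loss.backward()
opt.step()
print(f"toy training loss: {loss.item():.4f}")
```

The key contrast with earlier approaches is in the weighting head: instead of a single global confusion model shared by all samples, each sample receives its own fusion weights, so an annotator can be trusted on easy samples but down-weighted on ones where they tend to err.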
