Paper Title

PENCIL: Deep Learning with Noisy Labels

Authors

Kun Yi, Guo-Hua Wang, Jianxin Wu

Abstract

Deep learning has achieved excellent performance in various computer vision tasks, but it requires a large number of training examples with clean labels. Datasets with noisy labels are easy to collect, but such noise causes networks to overfit severely and accuracy to drop dramatically. To address this problem, we propose an end-to-end framework called PENCIL, which updates both the network parameters and the label estimates, represented as label distributions. PENCIL is independent of the backbone network structure and needs neither an auxiliary clean dataset nor prior information about the noise, so it is more general and robust than existing methods and is easy to apply. PENCIL can even be applied repeatedly to obtain better performance. It outperforms previous state-of-the-art methods by large margins on both synthetic and real-world datasets with different noise types and noise rates. PENCIL is also effective in multi-label classification tasks when a simple attention structure is added to the backbone network. Experiments show that PENCIL is robust on clean datasets, too.
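The idea of treating each label as a learnable distribution can be illustrated with a small sketch. The abstract does not state the loss function, so the code below assumes one plausible choice: a KL-style term that pulls the label distribution toward the network's prediction, plus a cross-entropy term, weighted by a hypothetical `alpha`, that keeps it compatible with the original noisy label. The names `update_label_logits`, `alpha`, and `lr`, and the fixed stand-in prediction `pred`, are illustrative assumptions rather than the paper's implementation; in a full training loop the network parameters would be updated in the same step.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_label_logits(label_logits, pred, noisy_onehot, lr=1.0, alpha=0.1):
    """One gradient-descent step on the per-sample label logits.

    Assumed loss (not taken from the abstract):
        KL(pred || y_d) + alpha * CE(noisy_onehot, y_d),  y_d = softmax(label_logits)
    Its gradient w.r.t. the logits is (y_d - pred) + alpha * (y_d - noisy_onehot).
    """
    y_d = softmax(label_logits)
    grad = [
        (y_d[k] - pred[k]) + alpha * (y_d[k] - noisy_onehot[k])
        for k in range(len(label_logits))
    ]
    return [label_logits[k] - lr * grad[k] for k in range(len(label_logits))]

# Toy example with 3 classes: the given (noisy) label says class 0,
# while the stand-in network prediction strongly favors class 1.
noisy_onehot = [1.0, 0.0, 0.0]
label_logits = [3.0 * v for v in noisy_onehot]  # initialize from the noisy label
pred = [0.05, 0.90, 0.05]                       # hypothetical fixed network output

for _ in range(200):
    label_logits = update_label_logits(label_logits, pred, noisy_onehot)

label_dist = softmax(label_logits)
corrected_class = max(range(len(label_dist)), key=lambda k: label_dist[k])
```

In this toy run the most probable class of the label distribution moves from the noisy class 0 to class 1, the class the stand-in prediction favors, while the `alpha` term keeps some probability mass on the original label.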
