Progressive Identification of True Labels for Partial-Label Learning

Paper title

Progressive Identification of True Labels for Partial-Label Learning

Authors

Jiaqi Lv, Miao Xu, Lei Feng, Gang Niu, Xin Geng, Masashi Sugiyama

Abstract

Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label. Most existing methods elaborately designed learning objectives as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. The goal of this paper is to propose a novel framework of PLL with flexibility on the model and optimization algorithm. More specifically, we propose a novel estimator of the classification risk, theoretically analyze the classifier-consistency, and establish an estimation error bound. Then we propose a progressive identification algorithm for approximately minimizing the proposed risk estimator, where the update of the model and identification of true labels are conducted in a seamless manner. The resulting algorithm is model-independent and loss-independent, and compatible with stochastic optimization. Thorough experiments demonstrate it sets the new state of the art.
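
The interleaving of model updates and true-label identification described in the abstract can be illustrated with a small sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration: a linear softmax model, a cross-entropy loss weighted by per-label confidences, and a weight update that renormalises the model's predicted probabilities over each instance's candidate set. All function names (`init_weights`, `update_weights`, `predict`, `train`) are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_weights(candidates, num_classes):
    """Start with uniform weight on every candidate label, zero elsewhere."""
    return [1.0 / len(candidates) if k in candidates else 0.0
            for k in range(num_classes)]


def update_weights(probs, candidates):
    """Renormalise the model's confidence over the candidate set (assumed rule)."""
    mass = sum(probs[k] for k in candidates)
    return [probs[k] / mass if k in candidates else 0.0
            for k in range(len(probs))]


def predict(W, x):
    """Class probabilities of a linear softmax model (hypothetical stand-in)."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])


def train(data, num_classes, dim, epochs=50, lr=0.5):
    """Interleave SGD steps with label-weight updates, one instance at a time."""
    W = [[0.0] * dim for _ in range(num_classes)]
    weights = [init_weights(cands, num_classes) for _, cands in data]
    for _ in range(epochs):
        for i, (x, cands) in enumerate(data):
            probs = predict(W, x)
            # Gradient of the weighted cross-entropy w.r.t. logit k is
            # probs[k] - weights[i][k], because the weights sum to one.
            for k in range(num_classes):
                grad = probs[k] - weights[i][k]
                for d in range(dim):
                    W[k][d] -= lr * grad * x[d]
            # Re-estimate this instance's label weights with the updated model.
            weights[i] = update_weights(predict(W, x), cands)
    return W, weights


# Toy data: two identical inputs whose candidate sets share only label 0.
data = [([1.0, 0.0], {0, 1}), ([1.0, 0.0], {0, 2})]
W, weights = train(data, num_classes=3, dim=2)
identified = [max(range(3), key=lambda k: w[k]) for w in weights]
```

In this toy run, `identified` holds the highest-weight label for each instance. Because label 0 is the only candidate the two identical inputs have in common, both instances push its score up, and the weights concentrate on it. Labels outside an instance's candidate set always keep weight zero. Since each step touches a single instance, the same loop structure would work with mini-batches and with any differentiable model in place of the linear one.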
