Paper Title
Instance-Dependent Label-Noise Learning with Manifold-Regularized Transition Matrix Estimation
Authors
Abstract
In label-noise learning, estimating the transition matrix has attracted increasing attention because the matrix plays an important role in building statistically consistent classifiers. However, estimating the transition matrix T(x), where x denotes the instance, is very challenging because T(x) is unidentifiable under instance-dependent noise (IDN). To address this problem, we note that there is psychological and physiological evidence that humans are more likely to annotate instances of similar appearance with the same classes, and thus poor-quality or ambiguous instances of similar appearance are more likely to be mislabeled as the correlated or same noisy classes. Therefore, we propose an assumption on the geometry of T(x): "the closer two instances are, the more similar their corresponding transition matrices should be". More specifically, we formulate this assumption as a manifold embedding, which effectively reduces the degrees of freedom of T(x) and makes it stably estimable in practice. The proposed manifold-regularized technique works by directly reducing the estimation error without hurting the approximation error of the T(x) estimation problem. Experimental evaluations on four synthetic and two real-world datasets demonstrate that our method is superior to state-of-the-art approaches for label-noise learning under the challenging IDN.
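The geometric assumption in the abstract ("the closer two instances are, the more similar their corresponding transition matrices should be") can be illustrated with a generic graph-smoothness penalty. The sketch below is not taken from the paper: the k-nearest-neighbour affinity, the squared Frobenius distance between matrices, and the names `knn_affinity` and `manifold_penalty` are all assumptions chosen for illustration, and the paper's actual manifold embedding may be formulated differently.

```python
import numpy as np


def knn_affinity(features, k=1):
    """Symmetric 0/1 affinity: S[i, j] = 1 if j is among the k nearest
    neighbours of i (Euclidean distance) or vice versa. Hypothetical helper."""
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    d2 = np.sum(diffs ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # an instance is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(S, S.T)  # symmetrise


def manifold_penalty(T, S):
    """0.5 * sum_ij S[i, j] * ||T[i] - T[j]||_F^2 for per-instance
    transition matrices T of shape (n, C, C). Illustrative assumption only."""
    T = np.asarray(T, dtype=float)
    diffs = T[:, None, :, :] - T[None, :, :, :]
    frob2 = np.sum(diffs ** 2, axis=(-2, -1))  # shape (n, n)
    return 0.5 * np.sum(S * frob2)


# Toy usage: two well-separated pairs of instances, two classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
S = knn_affinity(X, k=1)
T_a = np.eye(2)
T_b = np.array([[0.7, 0.3], [0.2, 0.8]])
T_smooth = np.stack([T_a, T_a, T_b, T_b])  # neighbours share a matrix
print(manifold_penalty(T_smooth, S))
```

In this toy setting the penalty vanishes when neighbouring instances share a transition matrix and grows as neighbouring matrices diverge. Added to a training loss, a term of this kind couples the estimates of T(x) at nearby instances, which is one way to read the abstract's claim about reducing the degrees of freedom of T(x).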