Paper Title

Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE)

Authors

Sumeet Menon, David Chapman

Abstract

Semi-supervised learning is the problem of training an accurate predictive model by combining a small labeled dataset with a presumably much larger unlabeled dataset. Many methods for semi-supervised deep learning have been developed, including pseudolabeling, consistency regularization, and contrastive learning techniques. Pseudolabeling methods, however, are highly susceptible to confounding, in which erroneous pseudolabels are assumed to be true labels in early iterations, causing the model to reinforce its prior biases and thereby fail to achieve strong predictive performance. We present a new approach to suppress confounding errors through a method we describe as Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE). Like basic pseudolabeling, SCOPE is related to Expectation Maximization (EM), a latent variable framework that can be extended toward understanding cluster-assumption deep semi-supervised algorithms. However, unlike basic pseudolabeling, which fails to adequately take into account the probability of the unlabeled samples given the model, SCOPE introduces an outlier suppression term designed to improve the behavior of EM iteration given a discriminative DNN backbone in the presence of outliers. Our results show that SCOPE greatly improves semi-supervised classification accuracy over a baseline and, when combined with consistency regularization, achieves the highest reported accuracy for the semi-supervised CIFAR-10 classification task using 250 and 4000 labeled samples. Moreover, we show that SCOPE reduces the prevalence of confounding errors during pseudolabeling iterations by pruning erroneous high-confidence pseudolabeled samples that would otherwise contaminate the labeled set in subsequent retraining iterations.
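
To make the pruning idea in the abstract concrete, the following is a minimal sketch of one pseudolabel selection step with an added outlier filter. It is not the SCOPE algorithm: the abstract does not specify the outlier suppression term, so the cosine-similarity test against class centroids, the function names (`class_centroids`, `select_pseudolabels`), and the thresholds `conf_min` and `sim_min` are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_centroids(embeddings, labels):
    """Mean embedding per class, computed from the labeled set."""
    sums, counts = {}, {}
    for emb, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(emb))
        for i, x in enumerate(emb):
            acc[i] += x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}


def select_pseudolabels(probs, embeddings, centroids, conf_min=0.95, sim_min=0.5):
    """Return (index, label) pairs for unlabeled samples that pass both tests.

    Assumed filter, not the paper's: a sample is kept only if the classifier
    is confident AND its embedding is close to the centroid of the class it
    was assigned to. Confident samples that fail the second test are treated
    as outliers and pruned.
    """
    kept = []
    for idx, (p, emb) in enumerate(zip(probs, embeddings)):
        label = max(range(len(p)), key=p.__getitem__)  # argmax class
        if p[label] < conf_min:
            continue  # basic pseudolabeling: skip low-confidence samples
        if label not in centroids:
            continue  # no labeled reference for this class
        if cosine(emb, centroids[label]) < sim_min:
            continue  # confident but far from its class: prune as outlier
        kept.append((idx, label))
    return kept


# Toy data (hypothetical): two classes, 2-D embeddings.
labeled_emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labeled_y = [0, 0, 1, 1]
centroids = class_centroids(labeled_emb, labeled_y)

unlabeled_probs = [[0.98, 0.02], [0.97, 0.03], [0.60, 0.40]]
unlabeled_emb = [[0.8, 0.2], [0.1, 1.0], [1.0, 0.0]]

kept = select_pseudolabels(unlabeled_probs, unlabeled_emb, centroids)
print(kept)
```

In this toy run the second unlabeled sample is predicted as class 0 with high confidence, but its embedding lies near the class 1 centroid, so the filter drops it rather than letting it enter the labeled set for the next retraining round. Plain confidence thresholding alone would have accepted it.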
