How Out-of-Distribution Data Hurts Semi-Supervised Learning

Paper Title

How Out-of-Distribution Data Hurts Semi-Supervised Learning

Authors

Xujiang Zhao, Krishnateja Killamsetty, Rishabh Iyer, Feng Chen

Abstract

Recent semi-supervised learning (SSL) algorithms have achieved higher overall performance thanks to better representations of unlabeled data. Nonetheless, recent research suggests that SSL performance can degrade when the unlabeled set contains out-of-distribution (OOD) examples. This work addresses the following question: how does OOD data adversely affect SSL algorithms? To answer it, we investigate the critical causes of the negative effect of OOD data on SSL algorithms. In particular, we find that 1) OOD instances close to the decision boundary have a more significant impact on performance than those farther away, and 2) Batch Normalization (BN), a popular module, may degrade rather than improve performance when the unlabeled set contains OOD examples. Building on these findings, we develop a unified weighted robust SSL framework that can be easily extended to many existing SSL algorithms and improves their robustness against OOD data. More specifically, we develop an efficient bi-level optimization algorithm that accommodates high-order approximations of the objective and scales to multiple inner optimization steps, learning a massive number of weight parameters while outperforming existing low-order approximations of bi-level optimization. Further, we conduct a theoretical study of the impact of faraway OOD examples in the BN step and propose a weighted batch normalization (WBN) procedure for improved performance. Finally, we discuss the connection between our approach and low-order approximation techniques. Our experiments on synthetic and real-world datasets demonstrate that the proposed approach significantly enhances the robustness of four representative SSL algorithms against OOD data compared to four state-of-the-art robust SSL strategies.
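
The abstract does not spell out the WBN procedure. As a rough illustration only, the sketch below assumes that weighting means computing the batch mean and variance with per-example weights, so that an example with a weight near zero barely shifts the statistics. The function name weighted_batch_norm, the weights, and the toy activations are hypothetical and are not taken from the paper.

```python
import math


def weighted_batch_norm(values, weights, eps=1e-5):
    """Normalize a batch of scalar activations with weighted statistics.

    Illustrative assumption, not the paper's exact procedure: each
    example contributes to the mean and variance in proportion to
    its weight.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean = sum(w * x for w, x in zip(weights, values)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / total
    scale = math.sqrt(var + eps)
    normalized = [(x - mean) / scale for x in values]
    return normalized, mean, var


# Three similar activations plus one far-away value standing in for an
# OOD example (toy numbers chosen for illustration).
batch = [0.9, 1.0, 1.1, 50.0]
uniform = [1.0, 1.0, 1.0, 1.0]       # equal weights: ordinary batch statistics
downweighted = [1.0, 1.0, 1.0, 0.0]  # the far-away example gets zero weight

_, mean_plain, var_plain = weighted_batch_norm(batch, uniform)
_, mean_weighted, var_weighted = weighted_batch_norm(batch, downweighted)
print("unweighted mean/var:", mean_plain, var_plain)
print("weighted mean/var:  ", mean_weighted, var_weighted)
```

With equal weights the single far-away value dominates both statistics. With its weight set to zero, the mean and variance reflect only the three similar activations. In the framework the abstract describes, such weights would be learned rather than set by hand.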
