Paper Title

Review of Methods for Handling Class-Imbalanced in Classification Problems

Paper Authors

Rawat, Satyendra Singh; Mishra, Amit Kumar

Paper Abstract

Learning classifiers from skewed or imbalanced datasets can lead to classification problems, and this is a serious issue. In some cases one class contains the majority of the examples, while the other, frequently the more important class, is represented by only a small proportion of examples. Using this kind of data can make many carefully designed machine-learning systems ineffective, because training becomes biased toward the majority class at the expense of the minority class. The most effective remedies for this issue therefore typically aim to benefit the minority class. The article examines the most widely used methods for addressing the problem of learning with a class imbalance, including data-level, algorithm-level, hybrid, cost-sensitive learning, and deep-learning approaches, together with their advantages and limitations. The efficiency and performance of the resulting classifiers are assessed using a wide range of evaluation metrics.
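
As a rough illustration of the remedy families the abstract lists, the minimal sketch below (not taken from the paper) contrasts a data-level fix, random oversampling of the minority class, with a cost-sensitive, class-weighted classifier, and scores both with imbalance-aware metrics rather than plain accuracy. It assumes scikit-learn and NumPy are available; the dataset, model, and helper names are illustrative, not the authors' implementation.

```python
# Sketch: data-level vs. cost-sensitive handling of class imbalance.
# All names here are illustrative; this is not the reviewed paper's code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic skewed dataset: ~5% positives play the minority, "more important" class.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def report(name, clf, X_fit, y_fit):
    # Fit a classifier and print imbalance-aware scores on the held-out split.
    clf.fit(X_fit, y_fit)
    proba = clf.predict_proba(X_te)[:, 1]
    pred = clf.predict(X_te)
    print(f"{name}: balanced acc={balanced_accuracy_score(y_te, pred):.3f} "
          f"F1={f1_score(y_te, pred):.3f} AUC={roc_auc_score(y_te, proba):.3f}")

# Baseline: plain training, typically biased toward the majority class.
report("baseline", LogisticRegression(max_iter=1000), X_tr, y_tr)

# Data-level remedy: randomly duplicate minority examples until classes balance.
minority = np.where(y_tr == 1)[0]
extra = np.random.default_rng(0).choice(
    minority, size=(y_tr == 0).sum() - minority.size)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])
report("oversampled", LogisticRegression(max_iter=1000), X_bal, y_bal)

# Cost-sensitive (algorithm-level) remedy: penalize minority errors more heavily.
report("class-weighted",
       LogisticRegression(max_iter=1000, class_weight="balanced"), X_tr, y_tr)
```

Hybrid methods reviewed in the paper combine these two ideas (for example, resampling followed by cost-sensitive or ensemble learning); the metrics above are among those the paper discusses for evaluating classifiers on skewed data.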
