Paper Title

Handling Imbalanced Classification Problems With Support Vector Machines via Evolutionary Bilevel Optimization

Authors

Alejandro Rosales-Pérez, Salvador García, Francisco Herrera

Abstract

Support vector machines (SVMs) are popular learning algorithms to deal with binary classification problems. They traditionally assume equal misclassification costs for each class; however, real-world problems may have an uneven class distribution. This article introduces EBCS-SVM: evolutionary bilevel cost-sensitive SVMs. EBCS-SVM handles imbalanced classification problems by simultaneously learning the support vectors and optimizing the SVM hyperparameters, which comprise the kernel parameter and misclassification costs. The resulting optimization problem is a bilevel problem, where the lower level determines the support vectors and the upper level the hyperparameters. This optimization problem is solved using an evolutionary algorithm (EA) at the upper level and sequential minimal optimization (SMO) at the lower level. These two methods work in a nested fashion, that is, the optimal support vectors help guide the search of the hyperparameters, and the lower level is initialized based on previous successful solutions. The proposed method is assessed using 70 datasets of imbalanced classification and compared with several state-of-the-art methods. The experimental results, supported by a Bayesian test, provided evidence of the effectiveness of EBCS-SVM when working with highly imbalanced datasets.
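
The bilevel structure described in the abstract can be written out roughly as follows. This is a sketch under assumptions, not the paper's exact formulation: the upper-level objective F is left generic (the abstract does not say which performance measure is optimized), the single kernel parameter is assumed to be the width γ of an RBF kernel, and C⁺ and C⁻ denote the misclassification costs of the positive and negative classes. The lower level is the standard cost-sensitive SVM dual, which is the kind of problem SMO solves.

```latex
% Assumptions: F is an unspecified performance measure;
% k_\gamma(x, x') = \exp(-\gamma \lVert x - x' \rVert^2) is an assumed RBF kernel.
\begin{aligned}
\text{upper level:}\quad
  & \max_{\theta = (\gamma,\, C^{+},\, C^{-})} \; F\bigl(\theta,\ \alpha^{*}(\theta)\bigr) \\
\text{lower level:}\quad
  & \alpha^{*}(\theta) \in \arg\max_{\alpha} \;
    \sum_{i=1}^{n} \alpha_i
    - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
      \alpha_i \alpha_j \, y_i y_j \, k_\gamma(x_i, x_j) \\
  & \text{s.t.}\quad \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad
    0 \le \alpha_i \le C^{+} \ \text{if } y_i = +1, \qquad
    0 \le \alpha_i \le C^{-} \ \text{if } y_i = -1 .
\end{aligned}
```

In this reading, the evolutionary algorithm proposes candidate values of θ, SMO returns the corresponding α*(θ) (the support vectors are the training points with α_i > 0), and the resulting value of F guides the next generation. Setting C⁺ larger than C⁻ penalizes errors on the positive class more heavily, which is the usual way to compensate when the positive class is the minority.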
