Title

Feature subset selection for kernel SVM classification via mixed-integer optimization

Authors

Ryuta Tamura, Yuichi Takano, Ryuhei Miyashiro

Abstract

We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware. The goal of this paper is to establish an MIO approach for selecting the best subset of features for kernel SVM classification. To measure the performance of subset selection, we use the kernel-target alignment, which is the distance between the centroids of two response classes in a high-dimensional feature space. We propose a mixed-integer linear optimization (MILO) formulation based on the kernel-target alignment for feature subset selection, and this MILO problem can be solved to optimality using optimization software. We also derive a reduced version of the MILO problem to accelerate our MILO computations. Experimental results show good computational efficiency for our MILO formulation with the reduced problem. Moreover, our method can often outperform the linear-SVM-based MILO formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.
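The abstract defines the kernel-target alignment as the criterion for scoring feature subsets. As a minimal sketch of that idea (not the paper's MILO formulation), the alignment between a Gram matrix K and labels y in {-1, +1} can be computed as A(K, yy^T) = <K, yy^T>_F / (||K||_F · ||yy^T||_F), where ||yy^T||_F = n. The Gaussian kernel choice, toy data, and helper names below are illustrative assumptions:

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    # Gaussian (RBF) Gram matrix from pairwise squared distances.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def kernel_target_alignment(K, y):
    # A(K, yy^T) = <K, yy^T>_F / (||K||_F * ||yy^T||_F); for y in {-1,+1}^n,
    # <K, yy^T>_F = y^T K y and ||yy^T||_F = n.
    y = np.asarray(y, dtype=float)
    n = len(y)
    return (y @ K @ y) / (n * np.linalg.norm(K, "fro"))

# Toy data: feature 0 separates the two classes, feature 1 is pure noise.
rng = np.random.default_rng(0)
n = 40
y = np.repeat([-1.0, 1.0], n // 2)
X = np.column_stack([
    y + 0.1 * rng.standard_normal(n),  # informative feature
    rng.standard_normal(n),            # noise feature
])

# Score each single-feature subset by its kernel-target alignment.
a_informative = kernel_target_alignment(rbf_gram(X[:, [0]]), y)
a_noise = kernel_target_alignment(rbf_gram(X[:, [1]]), y)
assert a_informative > a_noise  # the informative feature aligns better
```

A subset-selection procedure would search over feature subsets to maximize this score; the paper's contribution is doing that search exactly via a mixed-integer linear optimization model rather than by such enumeration or greedy heuristics.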
