Machine Learning with the Sugeno Integral: The Case of Binary Classification

Paper title

Machine Learning with the Sugeno Integral: The Case of Binary Classification

Authors

Sadegh Abbaszadeh, Eyke Hüllermeier

Abstract

In this paper, we elaborate on the use of the Sugeno integral in the context of machine learning. More specifically, we propose a method for binary classification, in which the Sugeno integral is used as an aggregation function that combines several local evaluations of an instance, pertaining to different features or measurements, into a single global evaluation. Due to the specific nature of the Sugeno integral, this approach is especially suitable for learning from ordinal data, that is, when measurements are taken from ordinal scales. This is a topic that has not received much attention in machine learning so far. The core of the learning problem itself consists of identifying the capacity underlying the Sugeno integral. To tackle this problem, we develop an algorithm based on linear programming. The algorithm also includes a suitable technique for transforming the original feature values into local evaluations (local utility scores), as well as a method for tuning a threshold on the global evaluation. To control the flexibility of the classifier and mitigate the problem of overfitting the training data, we generalize our approach toward $k$-maxitive capacities, where $k$ plays the role of a hyper-parameter of the learner. We present experimental studies, in which we compare our method with competing approaches on several benchmark data sets.
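
To make the aggregation step concrete, the following is a minimal sketch of a discrete Sugeno integral used as a thresholded binary classifier. It is an illustration under stated assumptions, not the authors' algorithm: the capacity `mu` is given by hand rather than learned by linear programming, the local evaluations are assumed to be already scaled to [0, 1], and the names `sugeno_integral`, `classify`, and `beta`, as well as the `>=` decision rule, are hypothetical choices made for this example.

```python
def sugeno_integral(x, mu):
    """Discrete Sugeno integral of local evaluations x w.r.t. capacity mu.

    x  : list of local evaluations, assumed to lie in [0, 1]
    mu : dict mapping frozenset of criterion indices to a value in [0, 1];
         assumed monotone with mu[empty set] = 0 and mu[all criteria] = 1
    """
    # Sort criterion indices by increasing local evaluation.
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = 0.0
    for pos, i in enumerate(order):
        # All criteria whose evaluation is at least x[i].
        upper = frozenset(order[pos:])
        best = max(best, min(x[i], mu[upper]))
    return best


def classify(x, mu, beta):
    """Predict class 1 if the global evaluation reaches threshold beta (assumed rule)."""
    return 1 if sugeno_integral(x, mu) >= beta else 0


# Hand-specified capacity on two criteria (illustrative values only).
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.6,
    frozenset({0, 1}): 1.0,
}

print(sugeno_integral([0.8, 0.4], mu))
print(classify([0.8, 0.4], mu, beta=0.5))
print(classify([0.2, 0.9], mu, beta=0.5))
```

Because the integral uses only `min` and `max`, its result depends only on the ordering of the values involved, which is why this kind of aggregation is suited to ordinal scales.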
