Paper Title
Semi-Supervised Anomaly Detection Based on Quadratic Multiform Separation
Authors
Abstract
In this paper, we propose a novel method for semi-supervised anomaly detection (SSAD). Our classifier is named QMS22 because it was conceived in 2022 and is built on the framework of quadratic multiform separation (QMS), a recently introduced classification model. QMS22 tackles SSAD by solving a multi-class classification problem involving both the training set and the test set of the original problem. The classification problem intentionally includes classes with overlapping samples: one of the classes contains a mixture of normal samples and outliers, and all other classes contain only normal samples. An outlier score is then calculated for every sample in the test set using the outcome of the classification problem. We also evaluate the performance of QMS22 against top-performing classifiers on ninety-five imbalanced benchmark datasets from the KEEL repository. These classifiers are BRM (Bagging-Random Miner), OCKRA (One-Class K-means with Randomly-projected features Algorithm), ISOF (Isolation Forest), and ocSVM (One-Class Support Vector Machine). Using the area under the receiver operating characteristic curve (AUC) as the performance measure, we show that QMS22 significantly outperforms ISOF and ocSVM. Moreover, Wilcoxon signed-rank tests reveal no statistically significant difference between QMS22 and BRM, or between QMS22 and OCKRA.
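The evaluation protocol summarized in the abstract (one area-under-the-ROC-curve value per dataset, then a paired Wilcoxon signed-rank test between two classifiers) can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `wilcoxon_signed_rank`, the toy labels and scores, and the use of a plain normal approximation (no tie or continuity correction) for the p-value are all assumptions made for illustration.

```python
import math


def roc_auc(labels, scores):
    """Area under the ROC curve from outlier scores (hypothetical helper).

    Computed as the probability that a randomly chosen outlier (label 1)
    receives a higher score than a randomly chosen normal sample (label 0);
    ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one outlier and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test (hypothetical helper).

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences share their average rank. The two-sided
    p-value uses a normal approximation, which is an assumption of this
    sketch and is rough for small samples.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    if n == 0:
        return 0.0, 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    w_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return w_plus, w_minus, p_value


# Toy data: 1 marks an outlier, 0 a normal sample; scores are made up.
labels = [0, 0, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.3]
print(roc_auc(labels, scores))  # 0.75
```

In practice one would compute one AUC per dataset for each classifier and pass the two resulting lists to the paired test; a library routine such as `scipy.stats.wilcoxon` handles exact small-sample p-values and tie corrections, which this sketch omits.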