Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble

Paper title

Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble

Authors

Amjad Ali, Zardad Khan, Dost Muhammad Khan, Saeed Aldahmani

Abstract

The traditional k nearest neighbor (kNN) approach uses a distance formula within a spherical region to determine the k closest training observations to a test sample point. However, this approach may not work well when the test point is located outside this region. Moreover, aggregating many base kNN learners can result in poor ensemble performance due to high classification errors. To address these issues, this paper proposes a new optimal extended neighborhood rule based ensemble method. The rule determines neighbors in k steps: it starts from the sample point closest to the unseen observation and then selects subsequent nearest data points until the required number of observations is reached. Each base model is constructed on a bootstrap sample with a random subset of features, and, after a sufficient number of models have been built, the optimal models are selected based on out-of-bag performance. The proposed ensemble is compared with state-of-the-art methods on 17 benchmark datasets using accuracy, Cohen's kappa, and Brier score (BS). The performance of the proposed method is also assessed by adding contrived features to the original data.
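The k-step neighbor selection described in the abstract can be illustrated with a minimal sketch. It rests on one reading of the abstract, namely that each step picks the not-yet-chosen training point nearest to the previously chosen point, and the paper's exact rule may differ. The function names `extended_neighbours` and `predict`, the Euclidean distance, and the majority vote are assumptions made for illustration. The bootstrap sampling, random feature subsets, and out-of-bag model selection are not shown.

```python
import math
from collections import Counter


def extended_neighbours(X_train, query, k):
    """Pick k training indices in k chained steps (illustrative reading of the rule).

    Step 1 takes the training point nearest to the query; each later step takes
    the remaining training point nearest to the point chosen in the previous step.
    """
    remaining = set(range(len(X_train)))
    chosen = []
    anchor = query
    for _ in range(min(k, len(X_train))):
        # Nearest remaining point to the current anchor; ties broken by index.
        nxt = min(remaining, key=lambda i: (math.dist(X_train[i], anchor), i))
        chosen.append(nxt)
        remaining.remove(nxt)
        anchor = X_train[nxt]  # the next step searches around this point
    return chosen


def predict(X_train, y_train, query, k):
    """Majority vote over the labels of the chained neighbours."""
    votes = Counter(y_train[i] for i in extended_neighbours(X_train, query, k))
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): three points on one side of the query, one on the other.
X = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (-1.5, 0.0)]
y = ["a", "a", "a", "b"]
print(extended_neighbours(X, (0.0, 0.0), 3))  # [0, 1, 2]
print(predict(X, y, (0.0, 0.0), 3))           # a
```

In this toy example, plain kNN with k = 3 would select the points at distances 1, 1.5, and 2 from the query (indices 0, 3, 1), whereas the chained rule follows the run of nearby points and selects indices 0, 1, 2.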

Paper title