Paper Title
Binary Classification for High Dimensional Data using Supervised Non-Parametric Ensemble Method
Paper Authors
Paper Abstract
High-dimensional data poses many difficulties for machine learning classification algorithms. Generalization can be achieved with ensemble learning methods such as the bagging-based, supervised, non-parametric random forest algorithm. In this paper, we address binary classification of high-dimensional data by applying random forest to a polycystic ovary syndrome (PCOS) dataset. We implement the method and provide a detailed visualization of the data to support general inference. We achieve a training accuracy of 95.6% and a validation accuracy of over 91.74%.
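
The bagging idea behind random forest can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: it uses synthetic data rather than the PCOS dataset, depth-1 decision stumps rather than full decision trees, and all function names and parameter values (`make_data`, `fit_forest`, `run_demo`, 15 stumps, 5 features per stump) are assumptions chosen for illustration. Each stump is trained on a bootstrap sample and a random subset of features, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def make_data(n_samples, n_features, n_informative, rng):
    """Synthetic stand-in data: only the first n_informative features carry signal."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def fit_stump(X, y, feature_ids):
    """Exhaustively pick the (feature, threshold, polarity) with fewest errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            # Errors made by the rule "predict 1 when value > t".
            errors = sum((row[f] > t) != (label == 1) for row, label in zip(X, y))
            # Polarity 0 is the flipped rule, so its error count is the complement.
            for polarity, err in ((1, errors), (0, len(y) - errors)):
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, t, polarity = stump
    above = row[f] > t
    return int(above) if polarity == 1 else int(not above)


def fit_forest(X, y, n_trees, n_sub_features, rng):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    forest = []
    n = len(X)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(len(X[0])), n_sub_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


def accuracy(forest, X, y):
    return sum(predict_forest(forest, r) == t for r, t in zip(X, y)) / len(y)


def run_demo(seed=0):
    rng = random.Random(seed)
    X, y = make_data(220, 30, 10, rng)
    X_train, y_train, X_val, y_val = X[:150], y[:150], X[150:], y[150:]
    forest = fit_forest(X_train, y_train, n_trees=15, n_sub_features=5, rng=rng)
    return accuracy(forest, X_train, y_train), accuracy(forest, X_val, y_val)


if __name__ == "__main__":
    train_acc, val_acc = run_demo()
    print(f"train accuracy: {train_acc:.3f}, validation accuracy: {val_acc:.3f}")
```

Reporting training and validation accuracy separately, as the abstract does, shows how far performance drops on data the ensemble has not seen. In practice a library implementation with full decision trees would likely be used instead of this sketch.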