Paper Title
Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining
Paper Authors
Paper Abstract
Predicting students' academic performance has been an active research area in recent years, with many institutions focusing on improving student performance and education quality. Students' performance can be analyzed and predicted using various data mining techniques. Moreover, such techniques allow instructors to identify factors that may affect students' final marks. To that end, this work analyzes two undergraduate datasets from two different universities and aims to predict students' performance at two stages of course delivery (20% and 50%, respectively). This analysis makes it possible to choose appropriate machine learning algorithms and to optimize their parameters. Furthermore, this work adopts a systematic multi-split approach based on the Gini index and p-value. This is done by optimizing a suitable bagging ensemble learner built from any combination of six potential base machine learning algorithms. Experimental results show that the proposed bagging ensemble models achieve high accuracy for the target group on both datasets.
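As a rough illustration of the bagging idea named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It trains each ensemble member on a bootstrap resample of the training data and combines predictions by majority vote. The two base learners (nearest neighbour and nearest centroid), the function names, the toy marks, and the class labels "Good", "Fair", and "Weak" are all assumptions made for this example; the abstract does not name its six base algorithms, and the sketch omits the Gini-index and p-value selection steps and the parameter optimization.

```python
import random
from collections import Counter


def fit_nearest_neighbor(X, y):
    """Illustrative base learner 1: predict the label of the closest training point."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X]
        return y[dists.index(min(dists))]
    return predict


def fit_nearest_centroid(X, y):
    """Illustrative base learner 2: predict the class whose mean vector is closest."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    # One mean vector per class that appears in this training sample.
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }

    def predict(x):
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(centroids[label], x)),
        )
    return predict


def bagging_fit(X, y, base_fitters, n_estimators=15, seed=0):
    """Train n_estimators models, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for k in range(n_estimators):
        # Bootstrap: draw n row indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        # Cycle through the base learners so the ensemble mixes them.
        fit = base_fitters[k % len(base_fitters)]
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the members' predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): [quiz mark, assignment mark] out of 100.
X = [[90, 85], [88, 92], [95, 90], [85, 88],
     [65, 60], [60, 68], [70, 62], [62, 66],
     [30, 35], [25, 40], [35, 28], [28, 32]]
y = ["Good"] * 4 + ["Fair"] * 4 + ["Weak"] * 4

models = bagging_fit(X, y, [fit_nearest_neighbor, fit_nearest_centroid])
print(bagging_predict(models, [91, 89]))
print(bagging_predict(models, [29, 33]))
```

Cycling through the list of base learners is only one simple way to mix them; choosing which combination of base learners to keep, as the abstract describes, would need a separate search over candidate ensembles.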