Paper Title

Automated Code Extraction from Discussion Board Text Dataset

Authors

Sina Mahdipour Saravani, Sadaf Ghaffari, Yanye Luther, James Folkestad, Marcia Moraes

Abstract

This study introduces and investigates the capabilities of three different text mining approaches, namely Latent Semantic Analysis, Latent Dirichlet Allocation, and Clustering Word Vectors, for automating code extraction from a relatively small discussion board dataset. We compare the outputs of each algorithm with a previous dataset that was manually coded by two human raters. The results show that even with a relatively small dataset, automated approaches can be an asset to course instructors by extracting some of the discussion codes, which can be used in Epistemic Network Analysis.
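
The abstract does not say how algorithm outputs were scored against the human-coded dataset. As a rough illustration only, the sketch below measures overlap between a set of automatically extracted codes and a set of human-assigned codes using precision and recall. The function name `code_overlap`, the exact-match rule after lowercasing, and all example codes are hypothetical and are not taken from the paper.

```python
def code_overlap(extracted, human):
    """Compare automatically extracted codes with human-assigned codes.

    Assumption: two codes match only if they are identical after
    trimming whitespace and lowercasing. The paper may use a different rule.
    """
    extracted_set = {c.strip().lower() for c in extracted}
    human_set = {c.strip().lower() for c in human}
    matched = extracted_set & human_set
    # Precision: share of extracted codes that the human raters also used.
    precision = len(matched) / len(extracted_set) if extracted_set else 0.0
    # Recall: share of human codes that the algorithm recovered.
    recall = len(matched) / len(human_set) if human_set else 0.0
    return matched, precision, recall


# Hypothetical example data, not from the study.
human_codes = ["feedback", "retrieval practice", "spacing", "motivation"]
extracted_codes = ["Feedback", "spacing", "grades"]

matched, precision, recall = code_overlap(extracted_codes, human_codes)
print(sorted(matched), round(precision, 2), round(recall, 2))
```

Exact string matching is the simplest possible criterion; a real comparison would likely need to handle synonyms or multi-word codes, which this sketch does not attempt.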
