Paper Title
Get It Scored Using AutoSAS -- An Automated System for Scoring Short Answers
Paper Authors
Paper Abstract
In the era of MOOCs, online exams are taken by millions of candidates, and scoring short answers is an integral part of them. Evaluating these answers with human graders becomes intractable at that scale. Thus, a generic automated system capable of grading these responses should be designed and deployed. In this paper, we present a fast, scalable, and accurate approach to automated Short Answer Scoring (SAS). We propose and explain the design and development of a system for SAS, namely AutoSAS. Given a question along with its graded samples, AutoSAS can learn to grade that prompt successfully. This paper further lays out the features, such as lexical diversity, Word2Vec, and prompt and content overlap, that play a pivotal role in building our proposed model. We also present a methodology for indicating the factors responsible for scoring an answer. The trained model is evaluated on an extensively used public dataset, namely Automated Student Assessment Prize Short Answer Scoring (ASAP-SAS). AutoSAS shows state-of-the-art performance, improving on previous results by over 8% on some of the question prompts as measured by Quadratic Weighted Kappa (QWK), and shows performance comparable to humans.
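For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Quadratic Weighted Kappa is commonly computed between two sets of integer scores (for example, human grades and model predictions). It is not taken from the paper: the function name `quadratic_weighted_kappa` and its parameters are illustrative assumptions, and the score range is assumed to be a contiguous set of integers.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Illustrative QWK between two equal-length lists of integer scores.

    Assumption: scores are integers in [min_rating, max_rating].
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("at least two distinct rating levels are required")

    total = len(rater_a)

    # Observed agreement matrix: observed[i][j] counts answers that
    # rater A scored i and rater B scored j (offset by min_rating).
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms of each rater's scores.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty: larger score gaps are penalised more.
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Count expected if the two raters scored independently.
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Both raters gave one identical score to every answer.
        return 1.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human = [0, 1, 2, 3, 2, 1]
    model = [0, 1, 2, 2, 2, 1]
    print(quadratic_weighted_kappa(human, model, 0, 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. In practice, a library routine such as scikit-learn's `cohen_kappa_score` with `weights="quadratic"` would normally be used instead of a hand-written version.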