Paper title
A fresh look at introductory data science
Paper authors
Paper abstract
The proliferation of large, complex datasets has challenged universities to keep up with the demand for graduates trained in both the statistical and the computational skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data. To keep up with this demand, attracting students to data science early on and giving them a solid introduction to the field become increasingly important. We present a case study of an introductory undergraduate course in data science that is designed to address these needs. Offered at Duke University, this course has no prerequisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students. We discuss the unique set of challenges posed by offering such a course and, in light of these challenges, present a detailed discussion of the pedagogical design elements, content, structure, computational infrastructure, and assessment methodology of the course. We also offer a repository containing all teaching materials that are open-source, along with supplemental materials and the R code for reproducing the figures found in the paper.