Learning to Read through Machine Teaching

Paper title

Learning to Read through Machine Teaching

Authors

Ayon Sen, Christopher R. Cox, Matthew Cooper Borkenhagen, Mark S. Seidenberg, Xiaojin Zhu

Abstract

Learning to read words aloud is a major step towards becoming a reader. Many children struggle with the task because of the inconsistencies of English spelling-sound correspondences. Curricula vary enormously in how these patterns are taught. Children are nonetheless expected to master the system in limited time (by grade 4). We used a cognitively interesting neural network architecture to examine whether the sequence of learning trials could be structured to facilitate learning. This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K). We show how this sequence optimization problem can be posed as optimizing over a time-varying distribution, i.e., defining probability distributions over words at different steps in training. We then use stochastic gradient descent to find an optimal time-varying distribution and a corresponding optimal training sequence. We observed significant improvement in generalization accuracy compared to baseline conditions (random sequences; sequences biased by word frequency). These findings suggest an approach to improving learning outcomes in domains where performance depends on the ability to generalize beyond limited training experience.
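
The idea of a time-varying distribution over words can be illustrated with a small sketch. This is not the authors' implementation: the toy vocabulary, the softmax parameterization, and the names `softmax`, `sample_sequence`, and `logits_per_step` are all assumptions made for illustration. The sketch only shows how one distribution per training step yields a training sequence. It does not show the optimization of those distributions by stochastic gradient descent.

```python
import math
import random


def softmax(logits):
    """Turn a list of real-valued scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits_per_step, words, rng):
    """Draw one word per training step from that step's own distribution."""
    sequence = []
    for logits in logits_per_step:
        probs = softmax(logits)
        sequence.append(rng.choices(words, weights=probs, k=1)[0])
    return sequence


# Hypothetical toy vocabulary (not from the paper).
words = ["cat", "hat", "have", "gave"]
num_steps = 8

# Assumed schedule: the first half of training favors the first two words,
# the second half samples all words uniformly.
logits_per_step = [
    [2.0, 2.0, 0.0, 0.0] if t < num_steps // 2 else [0.0, 0.0, 0.0, 0.0]
    for t in range(num_steps)
]

rng = random.Random(0)  # fixed seed so the draw is reproducible
sequence = sample_sequence(logits_per_step, words, rng)
print(sequence)
```

In a full treatment, the per-step scores would be free parameters tuned so that a learner trained on sampled sequences generalizes better. Here they are fixed by hand.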
