Paper title
Distinguishing Learning Rules with Brain Machine Interfaces
Paper authors
Paper abstract
Despite extensive theoretical work on biologically plausible learning rules, clear evidence about whether and how such rules are implemented in the brain has been difficult to obtain. We consider biologically plausible supervised- and reinforcement-learning rules and ask whether changes in network activity during learning can be used to determine which learning rule is being used. Supervised learning requires a credit-assignment model estimating the mapping from neural activity to behavior, and, in a biological organism, this model will inevitably be an imperfect approximation of the ideal mapping, leading to a bias in the direction of the weight updates relative to the true gradient. Reinforcement learning, on the other hand, requires no credit-assignment model and tends to make weight updates following the true gradient direction. We derive a metric to distinguish between learning rules by observing changes in the network activity during learning, given that the mapping from brain to behavior is known by the experimenter. Because brain-machine interface (BMI) experiments allow for precise knowledge of this mapping, we model a cursor-control BMI task using recurrent neural networks, showing that learning rules can be distinguished in simulated experiments using only observations that a neuroscience experimenter would plausibly have access to.
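The contrast the abstract draws between the two families of learning rules can be illustrated with a toy calculation. The sketch below is not the metric derived in the paper and not the authors' recurrent-network model. It assumes a linear decoder `D` (standing in for the BMI mapping known to the experimenter), a hypothetical imperfect internal model `M` equal to `D` plus random error, and a simple node-perturbation rule as a stand-in for reinforcement learning. All names, sizes, and noise levels are illustrative assumptions.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def update_direction_alignment(n_units=50, n_out=2, model_noise=1.0,
                               n_perturb=20000, sigma=1e-3, seed=0):
    """Compare update directions in activity space with the true descent direction.

    Returns (supervised_cosine, reinforcement_cosine).
    Every quantity here is a hypothetical toy setup, not the paper's model.
    """
    rng = np.random.default_rng(seed)

    # True activity-to-behavior mapping (assumed linear, as in a BMI decoder).
    D = rng.standard_normal((n_out, n_units)) / np.sqrt(n_units)
    # Imperfect credit-assignment model: the true mapping plus random error.
    M = D + model_noise * rng.standard_normal(D.shape) / np.sqrt(n_units)

    u = rng.standard_normal(n_units)      # current network activity
    target = rng.standard_normal(n_out)   # desired behavioral output
    err = D @ u - target                  # behavioral error

    # Descent direction of the loss 0.5 * ||D u - target||^2 w.r.t. u.
    true_descent = -(D.T @ err)

    # Supervised rule: the error is routed back through the internal model M,
    # so the update inherits the mismatch between M and D.
    supervised = -(M.T @ err)

    # Node perturbation: correlate random activity perturbations with the
    # resulting change in loss; no model of D is needed.
    xi = sigma * rng.standard_normal((n_perturb, n_units))
    base_loss = 0.5 * float(err @ err)
    perturbed_loss = 0.5 * np.sum(((u + xi) @ D.T - target) ** 2, axis=1)
    reinforcement = -((perturbed_loss - base_loss)[:, None] * xi).mean(axis=0)

    return cosine(supervised, true_descent), cosine(reinforcement, true_descent)


if __name__ == "__main__":
    sl_cos, rl_cos = update_direction_alignment()
    print(f"supervised vs. true descent direction:    {sl_cos:.3f}")
    print(f"reinforcement vs. true descent direction: {rl_cos:.3f}")
```

Under these assumptions, the averaged perturbation-based update aligns closely with the true descent direction, while the model-based update is rotated away from it by the error in `M`. Note that a single perturbation-based update is noisy; only its average over many perturbations follows the gradient, which is why the sketch averages over `n_perturb` samples.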