Paper title
Towards biologically plausible Dreaming and Planning in recurrent spiking networks
Paper authors
Paper abstract
Humans and animals can learn new skills after practicing for a few hours, while current reinforcement learning algorithms require a large amount of data to achieve good performance. Recent model-based approaches show promising results by reducing the number of interactions with the environment needed to learn a desirable policy. However, these methods require biologically implausible ingredients, such as the detailed storage of older experiences and long periods of offline learning. The optimal way to learn and exploit world models is still an open question. Taking inspiration from biology, we suggest that dreaming might be an efficient expedient to use an inner model. We propose a two-module (agent and model) spiking neural network in which "dreaming" (living new experiences in a model-based simulated environment) significantly boosts learning. We also explore "planning", an online alternative to dreaming, which shows comparable performance. Importantly, our model does not require the detailed storage of experiences, and learns the world model and the policy online. Moreover, we stress that our network is composed of spiking neurons, further increasing its biological plausibility and implementability in neuromorphic hardware.