Paper Title
Multi-Agent Informational Learning Processes
Paper Authors
Paper Abstract
We introduce a new mathematical model of multi-agent reinforcement learning, the Multi-Agent Informational Learning Processor ("MAILP") model. The model is based on the notion that agents' policies hold a certain amount of information, and it describes how this information iteratively evolves and propagates across many agents. The model is very general; the only meaningful assumption it makes is that learning for individual agents progressively slows over time.